Journal of Applied Psychology 




KENNETH E. CLARK, Editor
University of Colorado 


Consulting Editors 


ARTHUR BRAYFIELD, Pennsylvania State University
GEORGE E. BRIGGS, Ohio State University
NORMAN FREDERIKSEN, Educational Testing Service
[name illegible], University of Iowa
EDWIN R. HENRY, Standard Oil Company of New Jersey
JOHN HOLLAND, National Merit Scholarship Corporation
CLIFFORD E. JURGENSEN, Minneapolis Gas Company
LAURENCE S. McGAUGHRAN, University of Houston
HAROLD F. ROTHE, Fairbanks, Morse and Company
THOMAS A. RYAN, Cornell University
A. C. W[illegible], University of New Mexico
QUINN McNEMAR, Stanford University




Volume 45, 1961 


ARTHUR C. HOFFMAN, Production Manager

HELEN ORR, Promotion Manager

Published bimonthly by the American Psychological Association, Inc.
Prince and Lemon Sts., Lancaster, Pa., and 1333 16th St. N.W.,
Washington 6, D. C.

Second-class postage paid at Lancaster, Pa.

© 1961 by the American Psychological Association, Inc.


Contents of Volume 45 


Albright, Lewis E., and Glennon, J. R. Personal History Correlates of Physical Scientists' Career Aspirations
Albright, Lewis E. See Smith, Wallace J.
Appel, Valentine. See Blum, Milton L.
Bachner, Virginia M. See Buel, William D.
Barrett, Richard S. See Katzell, Raymond A.
Bass, Bernard M. Some Aspects of Attempted, Successful, and Effective Leadership
Bauer, Herbert J. See Olson, Paul L.
Bechtoldt, Harold P. See Crites, John O.
Behringer, Richard D. See Ziller, Robert C.
Bell, C. R. Psychological versus Sociological Variables in Studies of Volunteer Bias in Surveys
Berry, Paul C. Effect of Colored Illumination upon Perceived Temperature
Blum, Milton L., and Appel, Valentine. Consumer versus Management Reaction in New Package Development
Bolster, B. I., and Springbett, B. M. The Reaction of Interviewers to Favorable and Unfavorable Information
Bottenberg, Robert A. See Harding, Francis D.
Bottenberg, Robert A. See Marks, Melvin R.
Boyd, J. B. Interests of Engineers Related to Turnover, Selection, and Management
Brainard, Robert W., Campbell, Richard J., and Elkin, Edwin H. Design and Interpretability of Road Signs
Buel, William D., and Bachner, Virginia M. The Assessment of Creativity in a Research Setting
Burg, Albert, and Hulbert, Slade. Dynamic Visual Acuity as Related to Age, Sex, and Static Acuity
Callis, Robert. See Filbeck, Robert W.
Campbell, Richard J. See Brainard, Robert W.
Chisman, James A., and Simon, J. Richard. Protection against Impulse-Type [...] Utilizing the Acoustic Reflex
Christal, Raymond E. See Marks, Melvin R.
Christy, R. T. See Kidd, J. S.
Clark, Kenneth E. Donald G. Paterson: 1892-1961
Clark, R. Ernest. The Limiting Hand Skin Temperature for Unaffected Manual Performance in the Cold
Clarke, Walter V. See Merenda, Peter F.
Cline, Victor B., and Richards, James M., Jr. A Comparison of Individuals [...] Judging Personality
Coleman, Edmund B., and Kim, Insup. Comparison of Several Styles of Typography in English
Colver, Robert M., and Spielberger, Charles D. Further Evidence of a [...] Analogies Test
Cook, Kenneth G. See Robinson, John E., Jr.
Crites, John O. Factor Analytic Definitions of Vocational Motivation


Crites, John O., Bechtoldt, Harold P., Goodstein, Leonard D., and Heilbrun, Alfred B., Jr. [...] Analysis of the California Psychological Inventory
Davies, Barbara L. See Morrill, Charles S.
Dawson, Robert I. See Kriedt, Philip H.
Drewes, Donald W. Development and Validation of Synthetic [...] Motion Analysis
Droege, Robert C., and Hill, Beatrice M. Comparison of Performance on Manual and Electric Typewriters
Dunnette, Marvin D. Driver Opinions and Reported Performance [...] Marking and Nighttime Visibility Conditions
Egeth, Howard E. See Wells, William D.
Eliseo, Thomas Stephan. See Weisskopf-Joelson, Edith.
Elkin, Edwin H. See Brainard, Robert W.
Evans, Richard I. A Psychological Investigation of a Group of Demographic, Personality, and Behavioral Variables as They Relate to Viewing Educational Television
Evans, Richard I., Weiland, Betty A., and Moore, Charles W. The Effect of Experience in Telecourses on Attitudes toward Instruction by Television and Impact of a Controversial Television Program


Festinger, Leon. See Schachter, Stanley. 
Filbeck, Robert W., and Callis, Robert. A Verification Scale for the Strong Vocational Interest Blank, Men's Form
Fine, Bernard J. The Effect of Exposure to an Extreme Stimulus on Judgments of Some Stimulus-Related Words
Fine, Bernard J. Welsh's Internalization Ratio as a Behavior Index
Fleishman, Edwin A. See Parker, James F., Jr.
Ford, John D., Jr. See Rogers, Miles S.
Forehand, Garlie A., and Guetzkow, Harold. The Administrative Judgment Test as Related to Descriptions of Executive Judgment Behaviors
Ghiselli, Edwin E. See Wiest, William M.
Glennon, J. R. See Albright, Lewis E.
Glennon, J. R. See Smith, Wallace J.
Goodstein, Leonard D., and Kirk, Barbara A. A Six-Year Follow-up Study of Graduate Students in Public Health [...]
Goodstein, Leonard D. See Crites, John O.
Gordon, Leonard V. See Stern, Ferdinand.
Grothe, Hilde, and Lyman, John. A Hierarchy of “Perceptual Usefulness” of Geometrical Cues in an Overlearned Dial-Reading Task
Guetzkow, Harold. See Forehand, Garlie A.
Hall, Charles E. See Merenda, Peter F.
Harding, Francis D., and Bottenberg, Robert A. [...] between Attitudes and Job Performance
Harrison, Roger. Cumulative Communality Cluster Analysis of Workers' Job Attitudes

Heilbrun, Alfred B., Jr. See Crites, John O. 


Hill, Beatrice M. See Droege, Robert C.
Hoffman, L. Richard. See Maier, Norman R. F.
Hollenbeck, G. P. See Bridgman, C. S.
Hughes, Charles L. Variability of Stroke Width within Digits
Hughes, J. L., and McNamara, W. J. [...] Instruction in Industry
Hulbert, Slade. See Burg, Albert.
Hyman, Ray. See Schachter, Stanley.
Jackson, Jay M. [...]
Jansen, Mathilda J. See Ziller, Robert C.
Jourard, Sidney M. [...]
Kaess, Walter A., Witryol, Sam L., and Nolan, Richard E. [...] in the Leaderless Group [...]
Katzell, Raymond A., Barrett, Richard S., and Parker, Treadway C. [...] Performance, and Situational Characteristics
Kennedy, James E. The Paired-Comparison Method and Central Tendency Effect in Esthetic [...]
Kidd, J. S. A Comparison of One-, Two-, and Three-Man Work Units under Various Conditions of Work Load
Kidd, J. S. A Comparison of Two Methods of Training in a Complex Task by Means of Task Simulation
Kidd, J. S., and Christy, R. T. [...] Procedures and Work-Team Productivity
Kim, Insup. See Coleman, Edmund B.
Kirchner, Wayne K. “Real-Life” Faking on the Strong Vocational Interest Blank by Sales Applicants
Kirk, Barbara A. See Goodstein, Leonard D.
Kjeldergaard, Paul M. Attitudes towards Newscasters as Measured by the Semantic Differential: A Descriptive Case
Knapp, Robert R. Objective Personality Test and Sociometric Correlates of Frequency of Sick Bay Visits
Kriedt, Philip H., and Dawson, Robert I. Response Set and the Prediction of Clerical Job Performance
Lodahl, Thomas M., and Porter, Lyman W. Psychometric Score Patterns, Social Characteristics, and Productivity of Small Industrial Work Groups
Lyman, John. See Grothe, Hilde.
McBain, William N. Noise, the “Arousal Hypothesis,” and Monotonous Work
McCormick, Ernest J. See Palmer, George J., Jr.
McNamara, W. J. See Hughes, J. L.
Maier, Norman R. F., and Hoffman, L. Richard. Organization and Creative Problem Solving


Marks, Melvin R., Christal, Raymond E., and Bottenberg, Robert A. Simple Formula Aids for Understanding the Joint Action of Two Predictors
Mausner, Bernard. Situational Effects on a Projective Test
Merenda, Peter F., Clarke, Walter V., and Hall, Charles E. Cross-Validity of Procedures for Selecting [...]
Meyer, Herbert H., and Walker, William B. Need for Achievement and Risk Preferences as They Relate to Attitudes toward Reward Systems and Performance Appraisal in an Industrial Setting
Minor, Frank J., and Revesman, Stanley L. Experimental Evaluation of Binary Codes for Console Display
Moore, Charles W. See Evans, Richard I.
Morrill, Charles S., and Davies, Barbara L. Target Tracking and Acquisition in Three Dimensions Using a Two-Dimensional Display Surface
Muckler, F. A. See Obermayer, R. W.
Nolan, Richard E. See Kaess, Walter A.
Nye, Charles T. See Rothe, Harold F.
Obermayer, R. W., Swartz, W. F., and Muckler, F. A. The Interaction of Information Displays with Control System Dynamics in Continuous Tracking
Olson, Paul L., Wachsler, Robert A., and Bauer, Herbert J. Driver Judgments of Relative Car [...]
Owens, William A. See Smith, Wallace J.
Painter, John J. See Tucker, W. T.
Palmer, George J., Jr., and McCormick, Ernest J. A Factor Analysis of Job Activities
Parker, James F., Jr., and Fleishman, Edwin A. Use of Analytical Information concerning Task Requirements to Increase the Effectiveness of Skill Training
Parker, Treadway C. See Katzell, Raymond A.
Porter, Lyman W. Perceived Trait Requirements in Bottom and Middle Management Jobs
Porter, Lyman W. A Study of Perceived Need Satisfactions in Bottom and Middle Management Jobs
Porter, Lyman W. See Lodahl, Thomas M.
Porter, Lyman W. See Wiest, William M.
Promisel, David M. Visual Target Location as a Function of Number and Kind of [...]
Ratz, H. C., and Ritchie, D. K. Operator Performance on a Chord Keyboard
Revesman, Stanley L. See Minor, Frank J.

Richards, James M., Jr. See Cline, Victor B. 

Ritchie, D. K. See Ratz, H. C. 

Robinson, John E., Jr., Cook, Kenneth G., and Zeleny, Charles E. Pilot Judgments of Simulated Collisions and Near Misses: A Comparison of Performance with Uncoded and Two-Tone Coded [...]
Rogers, Miles S., Ford, John D., Jr., and Tassone, Jack A. [...] an Information-Processing Crew
Rosen, Hjalmar. Desirable Attributes of Work: Four Levels of Management Describe Their Job [...]
Rosen, Hjalmar. Managerial Role Interaction: A Study of Three Managerial Levels
Schachter, Stanley, Willerman, Ben, Festinger, Leon, and Hyman, Ray. Emotional Disruption and Industrial Productivity
Schultz, Douglas G., and Siegel, Arthur I. Generalized Thurstone and Guttman Scales for Measuring Technical Skills in Job Performance
Siegel, Arthur I. See Schultz, Douglas G.
Simon, J. Richard. See Chisman, James A.
Smith, Wallace J., Albright, Lewis E., Glennon, J. R., and Owens, William A. The Prediction of Research Competence and Creativity from Personal History
Spielberger, Charles D. See Colver, Robert M.
Springbett, B. M. See Bolster, B. I.
Stephenson, Richard R. Chance versus Nonchance Scores on the SVIB
Stern, Ferdinand, and Gordon, Leonard V. Ability to Follow Instructions as a Predictor of Success in Recruit Training
Swartz, W. F. See Obermayer, R. W.
Sydiaha, Daniel. Bales' Interaction Process Analysis of Personnel Selection Interviews
Tassone, Jack A. See Rogers, Miles S.
Trumbo, Don A. Individual and Group Correlates of Attitudes toward Work-Related Change
Tucker, W. T., and Painter, John J. Personality and Product Use


Wachsler, Robert A. See Olson, Paul L. 


Walker, William B. See Meyer, Herbert H.
Walther, Regis H. [...] as a Predictor of Success or Failure in Foreign Service Clerical Jobs
Weiland, Betty A. See Evans, Richard I.
Weisskopf-Joelson, Edith, and Eliseo, Thomas Stephan. An Experimental Study of the Effectiveness of Brainstorming
Wells, William D., Egeth, Howard E., and Wray, Nancy P. [...] Short Neuroticism and Extraversion Scales
Wiest, William M., Porter, Lyman W., and Ghiselli, Edwin E. Relations between Individual Proficiency and Team Performance and Team Efficiency
Willerman, Ben. See Schachter, Stanley.
Witryol, Sam L. See Kaess, Walter A.
Wray, Nancy P. See Wells, William D.
Zeleny, Charles E. See Robinson, John E., Jr.
Ziller, Robert C., Behringer, Richard D., and Jansen, Mathilda J. The Newcomer in Open and Closed Groups


Journal of Applied Psychology 


Vol. 45, No. 1


FEBRUARY 1961 


A STUDY OF PERCEIVED NEED SATISFACTIONS 
IN BOTTOM AND MIDDLE MANAGEMENT JOBS ¹


LYMAN W. PORTER


University of California, Berkeley 


Studies of management jobs in industry and business have in the past
tended to concentrate on the technical aspects of these jobs, such as
lists of duties, responsibilities, functions or activities performed,
or on the personality traits of individuals filling particular jobs.
Except for such recent investigations as those by Triandis (1959a,
1959b), relatively few studies have been concerned with how the
psychological characteristics of management jobs are perceived by the
individuals in the jobs. An understanding of the nature of job
perceptions held by individuals in management positions would seem to
be appropriate for the study of organizational problems. In many
cases, for example, individuals are promoted within management on the
basis of their technical qualifications for a job, while their
performance may in large part depend on how well they are able to
adjust to such psychological aspects of the job as the types of
motivational rewards received, the various pressures encountered, and
the perceived expectations of superiors and subordinates. In short,
some individuals may be qualified technically for particular
management jobs, but do not fit the psychological nature of the jobs.
If management and the individuals themselves knew more about the
psychological aspects of jobs, and of the differences between jobs at
different levels or in different parts of the organization,
promotional and other personnel errors might be reduced and
organizational effectiveness thereby increased.

¹ This study was begun as part of the research program of the
Institute of Industrial Relations at the University of California,
Berkeley. It was continued under a fellowship granted by the Ford
Foundation. The assistance of Robert Andrews and Mildred M. Henry is
gratefully acknowledged. The Institute of Social Sciences, University
of California, Berkeley, contributed to the support of the research
assistance.

The present investigation is concerned with 
one area of management job perceptions, 
namely, need satisfactions. Although a num- 
ber of studies have been carried out on the 
need satisfactions connected with various 
types of workers’ jobs, very few have con- 
centrated on management jobs, particularly 
where the level within the management hier- 
archy has been taken into account. As Haire 
(1959) has pointed out in a recent review 
of psychological problems relevant to business
and industry, “one area of motivational
studies is surprisingly lacking—the study of 
motivation in management . . . motivational 
analysis has not included the various levels 
of management” (p. 187). Studies of motivation
at any level of the organization have had
to face the problem of how to name and clas- 
sify various motives and needs. One of the 
most useful systems for the psychologist who 
is studying motivation in the industrial situa- 
tion is the grouping of motives or needs ac- 
cording to a hierarchy of prepotency (e.g., 
Maslow, 1943, 1954). Such a theory states, 
in essence, that there are basic or primary
needs, such as those for food, water, and
sleep, that an individual satisfies (at least
minimally) first, after which he turns to
so-called higher-order needs, such as those for
affiliation, nurturance, and esteem. Finally, if
the individual has achieved some degree of 
satisfaction of these first-order and middle- 
order needs, he may then spend effort on try- 
ing to satisfy the highest-order need, that of 


TABLE 1

CHARACTERISTICS OF SAMPLE BY MANAGEMENT LEVEL

                                               Management Levels
                                           Bottom       Middle
Characteristics                            Management   Management   Total

N in Obtained Sample                          64           75         139
Potential N Available                        121          107         228
% Obtained N of Available N                   52.9         70.1        61.0
Median Age (years)                            43.5         43.8        43.7
Median Seniority (years)                      15.8         16.0        15.9
Educational Level:
  % having education beyond high school       43.8         70.7        58.0


B, and C.² Briefly, the companies may be described as follows:

Company A: This company is a large, nationwide organization
manufacturing consumer container products. One plant of this company
was sampled; this plant employs over 1,000 workers and has 75
individuals classified as "management." The management response rate
was 53% from this company.

Company B: This company is also a nationwide concern. It processes
and distributes (with an emphasis on the latter function) a type of
food product. One of the two geographical divisions of this
organization was sampled for this study. This division employs
approximately 400 individuals, of whom 76 are classified as
"management" (including sales supervisory personnel). The response
rate from management in Company B was 70%.

Company C: This company is a medium-sized utility firm. Two divisions
of this corporation, containing a total of some 600 workers and 77
management personnel, were used to obtain a sample for this study.
For Company C, the response rate was 60%.

Table 1 presents the characteristics of the obtained sample for the
three companies combined. This table shows that a somewhat higher
percentage of middle-management individuals responded to the
questionnaire instrument than did those in bottom management. This
was probably due in part to the relatively higher educational level
of the middle managers and their consequent ability and willingness
to deal with written material. Table 1 also shows, however, that
there was very little difference between the two managerial groups of
respondents in either median age or median seniority.


Procedure

The questionnaires were distributed individually to all of the
members of management in each company, either by company or by United
States mail. Accompanying the questionnaire was a memorandum from the
chief company officer in each plant or division explaining that the
company had been asked to participate in a research project on
management and that the company had agreed to cooperate. In this
memorandum from the company it was made clear that no individual was
required to return the questionnaire, but each was strongly urged to
do so. The memorandum stated that the questionnaire was to be filled
out anonymously and that individual responses would not be made
available to the company. Along with the questionnaires and the
company memorandum, each respondent received a stamped,
self-addressed envelope in which to return his completed
questionnaire directly to the investigator at the university. This
method of gathering the data was used not only to facilitate the
mechanics of the process, but also to emphasize that this was a
research project being carried out at a university and was not a
company-sponsored study.

² The author wishes to thank the companies that agreed to participate
in this study, and their management personnel who supplied the basic
data.
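
The response figures given above for the three companies and in Table 1 can be cross-checked with a little arithmetic. The sketch below is only an illustration: the per-company numbers of returned questionnaires are not reported in the text, so they are estimated here from the rounded response rates, and the middle-management N of 75 is taken from the column header of Table 2.

```python
# Figures taken from the company descriptions and Table 1.
# Per-company returns are NOT reported in the text; they are estimated
# from the rounded response rates and are therefore approximations.
management_counts = {"A": 75, "B": 76, "C": 77}
response_rates = {"A": 0.53, "B": 0.70, "C": 0.60}


def pct(part, whole):
    """Percentage rounded to one decimal place, as printed in Table 1."""
    return round(100.0 * part / whole, 1)


potential_n = sum(management_counts.values())
estimated_returns = {
    company: round(management_counts[company] * response_rates[company])
    for company in management_counts
}
estimated_total = sum(estimated_returns.values())

print("potential N:", potential_n)
print("estimated returns per company:", estimated_returns)
print("estimated total returns:", estimated_total)
# Response percentages by management level (bottom, middle, total).
print(pct(64, 121), pct(75, 107), pct(139, 228))
```

Under these assumptions the estimated company returns sum to the obtained sample size reported in Table 1, which suggests the figures are mutually consistent.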


RESULTS 


Perceived Deficiencies in Need Fulfillment 


Table 2 presents data concerned with the
differences between perceived amount of present
fulfillment of needs and the amount of
fulfillment Ss believed should be available in
their positions. The basic data for the table
were obtained by ascertaining for each S and
each questionnaire item whether the individual
checked a higher scale-number for Part b of
the item (“How much of the characteristic do
you think should be connected with your
management position?”) than for Part a of the
item (“How much of the characteristic is
there now connected with your management
position?”). Whenever Part b was checked
higher than Part a, this was termed a
“deficiency” in need fulfillment.
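
The deficiency rule described above can be sketched in a few lines. This is a minimal illustration, not the author's scoring procedure: the function names are hypothetical, the ratings are invented, and a numeric rating scale for Parts a and b is assumed.

```python
def is_deficiency(part_a, part_b):
    """True when the 'should be' rating (Part b) exceeds the 'is now' rating (Part a)."""
    return part_b > part_a


def percent_deficient(ratings):
    """Percentage of respondents whose (part_a, part_b) pair shows a deficiency."""
    flags = [is_deficiency(a, b) for a, b in ratings]
    return 100.0 * sum(flags) / len(flags)


# Hypothetical (part_a, part_b) ratings for one item from four respondents.
# These are invented values, not data from the study.
example_ratings = [(3, 5), (4, 4), (6, 5), (2, 7)]
print(percent_deficient(example_ratings))
```

Note that equal ratings, or a Part a rating above Part b, do not count as a deficiency under this rule.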


TABLE 2

DIFFERENCES BETWEEN MANAGEMENT LEVELS IN PERCENTAGE OF Ss
INDICATING NEED-FULFILLMENT DEFICIENCIES

                              %          %
                            Bottom     Middle
                            Mgmt.      Mgmt.         %        p Value    p Value for
Questionnaire Items ᵃ      (N = 64)   (N = 75)   Difference  for Items  Categories ᵇ

Security needs
  Item I-a                   42.2       26.7        15.5        .06         .06
Social needs
  Item II-a                  37.5       33.3         4.2
       II-b                  32.8       30.7         2.1
  Average for category       35.2       32.0         3.2
Esteem needs
  Item III-a                 62.5       42.7        19.8        .02
       III-b                 71.9       45.3        26.6        .002
       III-c                 31.2       18.7        12.5        .09
  Average for category       55.2       35.6        19.6                    .005
Autonomy needs
  Item IV-a                  59.4       46.7        12.7
       IV-b                  60.9       44.0        16.9        .05
       IV-c                  62.5       46.7        15.8        .06
       IV-d                  57.8       53.3         4.5
  Average for category       60.2       47.7        12.5                    .02
Self-actualization needs
  Item V-a                   48.4       49.3        -0.9
       V-b                   64.1       50.7        13.4
       V-c                   67.2       60.0         7.2
  Average for category       59.9       53.3         6.6
“Nonspecific” items
  Item VI                    79.7       80.0        -0.3
  Item VII                   78.1       61.3        16.8        .03

ᵃ For complete wording of items refer to text in Method section.
ᵇ For method of computation of p values for categories with more than
one item, refer to text in Results section.


In Table 2, need fulfillment deficiencies in bottom-level management
positions are compared with those in middle-level positions for each
of the 15 questionnaire items (arranged by need category), and for
the five categories themselves. The percentages for items were
computed by taking all of the individuals at each management level,
without regard to company affiliation, and ascertaining the number
who checked a deficiency. Confidence levels of differences between
the percentages for the two management levels on each of the 15 items
were determined by the standard method of computing the significance
of differences between percentages. The resulting p values for the
items are given in the fourth column in the table. Percentages for
categories were computed by averaging the percentages of the
individual items in the categories. (It should be noted that such
averages are not necessarily based on a representative sampling of
possible items within each category, and the obtained
average-for-category values therefore should be interpreted with
caution.) Because the category averages are based on extremely small
Ns of items, the standard method of computing significance levels of
differences between averages could not be applied. Therefore, to
obtain levels of significance for category differences, the following
procedure was used: for each need category each individual received a
“score” based on the number of deficiencies checked for the items in
a particular category. For the esteem need category, for example, an
individual could get a score of 0-3, since there were three items in
this category. A distribution of scores was drawn up for each
category, using the total number of Ss without regard to management
level or company, and a dichotomy was made as near to the median as
possible. The percentages of scores above the median split (“high
scores”) were then used in the standard formula for the significance
of differences between percentages. The resulting p values for the
categories are presented in the fifth column of Table 2.
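
The category-level procedure just described (score each S, split the pooled scores near the median, then compare the percentage of "high scores" by level) can be sketched as follows. This is one possible reading of "as near to the median as possible", namely the cut that puts the share of high scores closest to 50%; the function names and the flag data are hypothetical.

```python
def category_scores(flags_by_subject):
    """Number of deficiencies each S checked across the items of one category."""
    return [sum(flags) for flags in flags_by_subject]


def nearest_median_cut(scores):
    """Cut point t such that the share of scores above t is as close to 50% as possible."""
    candidates = sorted(set(scores))[:-1]  # cutting at the maximum leaves no high scores
    if not candidates:
        raise ValueError("all scores are identical; no dichotomy is possible")
    n = len(scores)
    return min(candidates, key=lambda t: abs(sum(s > t for s in scores) / n - 0.5))


def percent_high(scores, levels, cut):
    """Percentage of 'high scores' (above the cut) within each management level."""
    result = {}
    for level in sorted(set(levels)):
        group = [s for s, lv in zip(scores, levels) if lv == level]
        result[level] = 100.0 * sum(s > cut for s in group) / len(group)
    return result


# Hypothetical deficiency flags for three esteem items (1 = deficiency checked).
# Invented for illustration; these are not data from the study.
flags = [(1, 0, 0), (1, 1, 0), (1, 1, 0), (1, 1, 1), (1, 1, 1),  # bottom management
         (0, 0, 0), (0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]  # middle management
levels = ["bottom"] * 5 + ["middle"] * 5

scores = category_scores(flags)
cut = nearest_median_cut(scores)
print(cut, percent_high(scores, levels, cut))
```

The two resulting percentages would then be compared with the same test for a difference between percentages that is used for the individual items.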

Table 2 shows, first, that except for Items 
V-a and VI, where the percentages were vir- 
tually equal, deficiencies were more frequently 
indicated for bottom- than for middle-man- 
agement positions. Items approaching or ex- 
ceeding statistical significance were ones con- 
cerning feeling of security, feeling of self- 
esteem, prestige inside the company, prestige 
outside the company, opportunity for inde- 
pendent thought and action, opportunity for 
participation in the setting of goals, and a 
feeling of being informed. These items that 
differentiated between the two management 
levels were, therefore, concentrated in the se- 
curity, esteem, and autonomy categories. 
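
The item-level p values in Table 2 can be approximated from the printed percentages. The text says only that "the standard method" for differences between percentages was used, so the sketch below assumes a pooled, two-sided normal-approximation test; the function name is hypothetical, and the counts are recovered by multiplying the percentages by the Ns.

```python
import math


def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p value for the difference between two independent proportions,
    using the pooled-variance normal approximation (an assumed method)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Item III-b: 71.9% of 64 is 46 Ss; 45.3% of 75 is 34 Ss.
p_iii_b = two_proportion_p(46, 64, 34, 75)
# Item III-a: 62.5% of 64 is 40 Ss; 42.7% of 75 is 32 Ss.
p_iii_a = two_proportion_p(40, 64, 32, 75)

print(round(p_iii_b, 4), round(p_iii_a, 4))
```

Under these assumptions the two results land close to the .002 and .02 printed for these items, which is consistent with, though not proof of, a two-tailed test of this kind.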

Secondly, Table 2, as explained above, per- 
mits a direct comparison between the two 
levels of management for each of the five 
categories as a whole, as well as for the in- 
dividual items in the categories. The last 
column in Table 2 shows clearly that the se- 
curity, esteem, and autonomy categories pro- 
duced the greatest differences between bottom 
and middle management with regard to need 
fulfillment deficiencies. For the other two
categories of social and self-actualization
needs, the differences were in the direction of
more frequent deficiencies in bottom management,
but neither category had differences approaching
statistical significance.

The trends evident in Table 2, on the dif- 
ferences between the two management levels 
in regard to need fulfillment deficiencies, are 
present not only when the data are combined 
from all three companies, as in Table 2, but 


also when the data are analyzed company by 
company. Thus, in each of the three indi- 
vidual companies, bottom-management indi- 
viduals more often indicated deficiencies than 
did middle-management personnel, with the 
esteem need area producing the largest differ- 
ence between the two management levels. The 
areas of security and autonomy tended to
produce the next largest differences between
the two management levels in all three
companies.

A third finding evident in Table 2 is that
there was a trend in both bottom- and
middle-management levels for the perceived
deficiencies to be indicated more frequently for
the higher-order need categories (autonomy
and self-actualization) than for the
lower-order need categories (security and social
needs). Only in the bottom-management level
was there a slight reversal, where the security
need item had a higher percentage than the
average for the two social need items. However,
both the security and social categories
in both levels of management had smaller
averages than did the higher-order categories.
For the particular items and need categories
used in this study, there appear to be greater
deficiencies in need fulfillment in the
so-called higher-order needs, both in bottom-
and in middle-management positions. Again,
this trend is clearly present in the separate
data for each company as well as in the
combined data from all three companies.


Importance of Needs 


Table 3 presents data on the importance of the various types of needs
to the individuals in lower- and middle-management jobs. The data in
this table were based on Part c of the questionnaire items, “How
important is this position characteristic to you?” Table 3 is
comparable in construction to Table 2, and presents for each item and
category the percentage of individuals in the two management levels
who checked “maximum” importance (i.e., who checked 7 on a 7-point
scale).


TABLE 3

DIFFERENCES BETWEEN MANAGEMENT LEVELS IN PERCENTAGE OF Ss
CHECKING MAXIMUM IMPORTANCE OF NEEDS

                              %          %
                            Bottom     Middle
                            Mgmt.      Mgmt.         %        p Value    p Value for
Questionnaire Items ᵃ      (N = 64)   (N = 75)   Difference  for Items  Categories ᵇ

Security needs
  Item I-a                   67.2       58.7         8.5
Social needs
  Item II-a                  57.8       61.3        -3.5
       II-b                  35.9       29.3         6.6
  Average for category       46.8       45.3         1.5
Esteem needs
  Item III-a                 42.2       36.0         6.2
       III-b                 35.9       25.3        10.6
       III-c                 26.6       22.7         3.9
  Average for category       34.9       28.0         6.9
Autonomy needs
  Item IV-a                  39.1       22.7        16.4        .03
       IV-b                  51.6       46.7         4.9
       IV-c                  43.8       38.7         5.1
       IV-d                  51.6       34.7        16.9        .04
  Average for category       46.5       35.7        10.8
Self-actualization needs
  Item V-a                   73.4       57.3        16.1        .05
       V-b                   60.9       57.3         3.6
       V-c                   60.9       68.0        -7.1
  Average for category       65.1       60.9         4.2
“Nonspecific” items
  Item VI                    65.6       44.0        21.6        .01
  Item VII                   46.9       41.3         5.6

ᵃ For complete wording of items refer to text in Method section.
ᵇ For method of computation of p value for categories with more than
one item, refer to text in Results section.


Table 3 shows that in contrast to the findings on need fulfillment
deficiencies, none of the categories as a whole produced significant
differences between bottom and middle management on the importance of
need fulfillment. (The differences between categories were analyzed
by the same method as that used in connection with Table 2.) Four
individual items (two autonomy items, one self-actualization item,
and the pay item) did produce significant differences between the two
management levels. (Except for the pay item, the differences in the
significant items were due almost entirely to relatively large
differences found in Company A. Respondents in the other two
companies contributed very little to the size of the differences for
these items.)

Table 3 also shows that in contrast to the 
results on fulfillment deficiencies, the items


and categories tending to produce the largest 
indication of maximum importance for both 
levels of management were not solely the 
higher-order needs. As is evident in Table 3, 
the two areas of greatest importance were a 
higher-order need, self-actualization, and a 
lower-order need, security. The other rela- 
tively higher-order needs, especially esteem, 
received the lowest average percentages of 
maximum importance in both management 
levels. Thus, when the importance of need 
satisfaction is considered, respondents in both 
the bottom- and middle-management levels 
did not show a clear trend of increase from 




lower-order needs to higher-order needs, as 
was the case when deficiencies in need satis- 
faction were considered. The absence of this 
trend for importance is present not only when 
respondents from all three companies are com- 
bined, as in Table 3, but also when the data 
from each company are analyzed separately. 
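
The contrast between the two patterns can be made concrete by ranking the five need categories on each measure. The sketch below copies the category values from Tables 2 and 3 (for security, the single item stands in for the category); the dictionary layout and function name are illustrative choices, not part of the study.

```python
# Category values (% of Ss) copied from Tables 2 and 3 as (bottom, middle).
deficiency = {
    "security": (42.2, 26.7),
    "social": (35.2, 32.0),
    "esteem": (55.2, 35.6),
    "autonomy": (60.2, 47.7),
    "self-actualization": (59.9, 53.3),
}
importance = {
    "security": (67.2, 58.7),
    "social": (46.8, 45.3),
    "esteem": (34.9, 28.0),
    "autonomy": (46.5, 35.7),
    "self-actualization": (65.1, 60.9),
}


def ranked(table, level):
    """Categories sorted from highest to lowest percentage.

    level: 0 for bottom management, 1 for middle management.
    """
    return sorted(table, key=lambda category: table[category][level], reverse=True)


for level, name in enumerate(("bottom", "middle")):
    print(name, "deficiency:", ranked(deficiency, level))
    print(name, "importance:", ranked(importance, level))
```

In both levels the two lower-order categories sit at the bottom of the deficiency ranking, whereas security moves to the top two of the importance ranking alongside self-actualization, with esteem last.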


DISCUSSION 


The results show that for the sample of in- 
dividuals and companies studied, lower-level 
management positions were more likely to 
produce deficiencies in fulfillment of psycho- 
logical needs than were middle-level positions. 
This suggests that there exists a differential 
opportunity within management to satisfy 
various motivational needs. The greatest dif- 
ferences between the two management levels 
occurred in a lower-order need area, security, 
and in two of the higher-order areas, esteem 
and autonomy. Contrary to expectation, mid- 
dle management was almost as dissatisfied as 
bottom management in the highest-order need 
area, self-actualization. In this area, only a 
relatively small difference was found between 
the two management levels. The differential 
opportunity within management to satisfy 
psychological needs appears to be much more 
prominent in areas other than self-actualiza- 
tion. 

It is essential to consider differences among 
need areas within management levels as well 
as the differences between management levels 
for given need areas. When this is done, the 
largest frequencies of perceived need fulfill- 
ment deficiency in both of the levels of man- 
agement were found to occur in the higher- 
order need areas, those of esteem, autonomy, 


Lyman W. Porter 


and self-actualization. These are the same
need areas that various writers have indicated 
are least satisfied in nonmanagement produc- 
tion positions. Thus, from the very bottom of 
organizations up through at least two-thirds 
of management, the higher-order needs are 
not being as well-satisfied as the lower-order 
needs. 

To assess the impact of need-fulfillment de- 
ficiencies, not only the size or amount of the 
deficiency must be taken into account, but 
also the importance of the particular need
area to the individual involved. The relevant
data that were obtained in this study showed
no consistent over-all differences between
personnel in the two levels of management in
how important they regarded fulfillment of
various types of psychological needs. This
finding contrasted with that for need-fulfillment
deficiencies, where definite bottom-middle
management differences were found in several
areas of needs. Within both levels of
management, security and self-actualization,
the lowest-order need (of those studied) and
the highest-order need, were perceived as the
most important of the five need categories
investigated, whereas the analysis of need-fulfillment
deficiencies revealed that the greatest
perceived deficiencies occurred in the three
highest need-orders.

Combining the results for deficiency with
those for importance presents a picture which
is summarized in Table 4. (The terms used
in the body of Table 4 represent the approximate
relative degrees of deficiency and importance
for the five need categories within
each management level.) In both lower and
middle management the most critical of the


TABLE 4

SUMMARY OF RELATIVE NEED-FULFILLMENT DEFICIENCY AND NEED
IMPORTANCE WITHIN MANAGEMENT LEVELS

                       Bottom Management          Middle Management
                       Relative     Relative      Relative     Relative
Need Categories        Deficiency   Importance    Deficiency   Importance
Security               moderate     large         small        large
Social                 small        moderate      small        moderate
Esteem                 large        small         small        small
Autonomy               large        moderate      moderate     small
Self-Actualization     large        large         large        large


Need Satisfactions in Management Jobs 9 


need hierarchy areas is the area of self-actuali- 
zation because individuals in both levels of 
management considered it to be of prime im- 
portance and the area where there were the 
greatest deficiencies in need fulfillment. The 
next most critical areas for bottom manage- 
ment are security, which was considered of 
major importance and to have relatively mod- 
erate deficiency in fulfillment, and autonomy, 
which had a relatively large deficiency and 
was seen as of moderate importance. Esteem 
and social needs appear to be the least criti- 
cal need areas for persons in bottom-manage- 
ment positions. The pattern for the combined 
deficiency and importance results in middle 
management is somewhat similar to the pat- 
tern in bottom management. Security ranks 
below self-actualization as a critical area for 
middle management because although the size 
of the perceived deficiency was relatively 
small for positions at that level, the impor- 
tance was relatively large. Even in middle 
management, however, security ranked above 
the autonomy, esteem, and social areas in the 
combined effect produced by deficiency and 
importance. 

Two of the items contained in the ques- 
tionnaire have so far been omitted from the 
discussion because they do not fit neatly into 
any of the five need categories specifically 
covered in this study. These are Items VI and 
VII, having to do with pay and with a feel- 
ing of being informed, respectively. These 
items have been considered separately because 
each seems to overlap into two or more need 
categories. The amount of pay one receives in 
his work would seem to satisfy both security 
and esteem needs, and is also a means of satis- 
fying primary physiological needs that could 
not be appropriately studied in this investiga- 
tion. Likewise, the need to be informed would 
seem to satisfy social, esteem, and perhaps 
autonomy needs. Even though neither of these 
items could be related to only one specific 
need fulfillment category, each is obviously 
important to consider since they are so inti- 
mately a part of a person's thoughts and feel- 
ings about the job and since they provide 
multiple psychological satisfaction. The pay 
item was found to have the highest frequency 
of responses indicating deficiency in fulfill- 
ment in both levels of management. It was 


regarded as having major importance by lower- 
management individuals, and relatively mod- 
erate importance by middle-management per- 
sonnel. Thus, pay, perhaps because it does 
satisfy several types of needs, is a crucial 
item for both lower-management and middle- 
management individuals. The feeling of being 
informed, as measured by Item VII in this 
study, produced the second most frequent in- 
dication of need fulfillment deficiency in both 
lower- and middle-management positions, al- 
though its frequency in middle management 
was significantly below that for the lower 
level. In terms of importance, it received 
relatively moderate support in both groups. 
Therefore, being "in" on plant and company
information is also a key area in fulfillment 
of needs of personnel in lower management, 
and a moderately crucial area for middle 
management. 


SUMMARY AND CONCLUSIONS 


This study investigated perceptions of bot- 
tom- and middle-management jobs. Specif- 
ically studied were perceptions of need ful- 
fillment deficiencies and need importance. 
Five need areas were selected for investiga- 
tion because of their relevance to the concept 
of a hierarchy of prepotency of needs, and 
their relevance to management positions. The 
five need areas were: security, social, esteem, 
autonomy, and self-actualization. The percep- 
tions regarding these needs were obtained by 
the administration of a questionnaire to 64 
bottom-management and 75 middle-manage- 
ment individuals in three separate industrial 
organizations. 

The following are the major conclusions: 

1. The vertical location of management po- 
sitions appears to be an important variable in 
determining the extent to which psychological 
needs are fulfilled. 

2. The greatest differences in the frequency 
of need-fulfillment deficiencies between bot- 
tom- and middle-management positions occur 
in the esteem, security, and autonomy need 
areas. These needs are significantly more 
often satisfied in middle than in bottom man- 
agement. 

3. Higher-order psychological needs are 
relatively the least satisfied needs in both 
bottom and middle management. 




4. Self-actualization and security are seen 
as more important areas of need satisfaction 
than the areas of social, esteem, and au- 
tonomy, by individuals in both bottom- and 
middle-management positions. 

5. The highest-order need of self-actualiza- 
tion is the most critical need area of those 
studied, in terms of both perceived deficiency 
in fulfillment and perceived importance to the 
individual, in both bottom and middle man- 
agement. This need is not perceived as sig- 
nificantly more satisfied at the middle-man- 


agement level than at the bottom-manage- 
ment level. 


REFERENCES 


Argyris, C. Personality and organization. New York:
Harper, 1957.

Davis, K. Human relations in business. New York:
McGraw-Hill, 1957.

Haire, M. Psychology in management. New York:
McGraw-Hill, 1956.

Haire, M. Psychological problems relevant to business
and industry. Psychol. Bull., 1959, 56, 169-194.

Leavitt, H. J. Managerial psychology. Chicago:
Univer. Chicago Press, 1958.

Maslow, A. H. A theory of human motivation.
Psychol. Rev., 1943, 50, 370-396.

Maslow, A. H. Motivation and personality. New
York: Harper, 1954.

Smith, H. C. Psychology of industrial behavior.
New York: McGraw-Hill, 1955.

Triandis, H. C. Categories of thought of managers,
clerks, and workers about jobs and people in
industry. J. appl. Psychol., 1959, 43, 338-344. (a)

Triandis, H. C. Differential perception of certain jobs
and people by managers, clerks, and workers in
industry. J. appl. Psychol., 1959, 43, 221-225. (b)

Walker, C. R., & Guest, R. H. The man on the
assembly line. Cambridge: Harvard Univer. Press,
1952.

Viteles, M. S. Motivation and morale in industry.
New York: Norton, 1953.


(Received February 23, 1960) 




Journal of Applied Psychology
1961, Vol. 45, No. 1, 11-15 


THE EFFECT OF EXPERIENCE IN TELECOURSES ON
ATTITUDES TOWARD INSTRUCTION BY TELEVISION
AND IMPACT OF A CONTROVERSIAL TELEVISION
PROGRAM¹


RICHARD I. EVANS, BETTY A. WIELAND, AND CHARLES W. MOORE


University of Houston 


Research in noncommercial educational tele- 
vision (ETV) has been concerned with a num- 
ber of problems but, in general terms, two di- 
rections of such research may be suggested. 
One seems primarily involved with television 
as a medium of instruction; the other, with 
a wide range of aspects of general noncom- 
mercial educational programing, from such 
problems as identification of the general ETV 
audience to studies of impact of specific ETV 
programs.

The present study was designed to investi- 
gate problems stemming from both directions. 
More specifically, it dealt with the following 
general problems: 

1. What is the effect of a single program on 
viewer attitudes? Since, as suggested in an 
earlier study (Evans, 1957), the audience for 
educational programs is not likely to watch an 
entire series consistently but may often view 
only one program of a series, evaluation of 
any program in a series becomes
important. A previous study (Asher & Evans,
1959) indicated that single programs of series
such as those produced by the National
Educational Television and Radio Center
have a measurable effect on viewers. While
this study demonstrated that attitude changes
toward certain concepts in the program could
be measured, it did not explore the effect on
such change of already existing theoretically
related attitudes. The present investigation
attempted not only to study possible attitude
changes that occurred as a result of viewing
a program but also endeavored to examine
the impact of such changes in terms of an
already existing theoretically related attitude.

¹ Supported in part by a grant-in-aid awarded to
the senior author by the National Educational
Television and Radio Center, Ann Arbor, Michigan.
Authors are indebted to psychology [illegible]
George Woodward and Montrose [illegible] for
assistance in implementing this project.

2. What is the effect of previous experience 
in taking a telecourse on attitudes toward tele- 
vision as an instructional medium? Since the 
subjects in our present investigation were stu- 
dents who had been in an institution where 
instructional television had been utilized for 
several years, it seemed appropriate to ex- 
amine the effect of such experience on atti- 
tudes. A previous report by the senior author 
(1956) suggested that factors other than tele- 
vision itself might account for unfavorable 
attitudes toward television. For example, dis- 
like of the subject matter of the telecourse, 
the instructor, or the grade received could re- 
sult in a displacement of feeling which would 
be expressed as an unfavorable attitude to- 
ward television. In effect, it is possible that 
the medium is blamed for disturbances stem- 
ming from attitudes toward other factors in 
the telecourse situation. In the present study, 
attitudes toward television as a function of 
experiences with telecourses were examined 
in terms of other possible contributing factors.


METHOD 
Subjects 


The subjects were 160 students enrolled in five 
sections of introductory psychology at the Univer- 
sity of Houston. For the attitude change section of 
the study, one section (n = 33) was designated as the
control group (C) and another (n = 24) as the ex- 
perimental group (E). The classes did not differ sig- 
nificantly in responses to the pretesting battery of 
tests as described below, so they could be considered 
adequately matched for purposes of the present in- 
vestigation. 


Experimental Program 


The television program used in the present study,
entitled Roots of Prejudice, is part of a series, People,
produced by the National Educational Television
and Radio Center² for national distribution. This
program, one-half hour in length, presents a class- 
room situation in which psychology instructors dis- 
cuss with students the development and character- 
istics of racial and religious prejudice. Psychology 
graduate student and faculty previewers agreed that 
two goals of the program were to challenge the va- 
lidity of racial or religious stereotyping and to es- 
tablish the need for social tolerance. The program 
can clearly be described as controversial on the basis 
of its potential emotional impact on the social climate 
of Houston, a large, eastern-Texas community: at 
present, Houston has the largest segregated school 
system in the United States. 


Psychological Measuring Instruments 


A group of four social psychological measurement 
devices was used. The first of these, the Evans Quasi- 
Role-Playing Device, was developed for an earlier 
study by the senior author (1952). It is a semi- 
projective device which yields both quantitative and 
qualitative indices of the intensity of the respondent's
prejudice toward social objects. The respondent
is asked to imagine or "cognitively role play" himself
as being either prejudiced or unprejudiced and
to give as many reasons as he can for being prejudiced
or unprejudiced. In the present study, the subjects
were asked to "role play" an individual favorable
to and one hostile to the Supreme Court desegregation
decision, listing as many arguments as
they could in favor of and in opposition to the decision
to desegregate the schools.

The second test used was a modification of the
widely used Osgood Semantic Differential (Osgood,
1957) designed to measure several dimensions of
attitudes. Eleven scales were used [...] good-bad,
[...] valuable-worthless, clear-hazy, [...] (weak-strong),
[...] scale (active-passive), so that each concept was
measured by six [...] of concepts toward [...]. One
group of concepts used was related directly to ideas
presented in the program and included racial
discrimination, integration of the races, Negroes,
white people, Blake ([...] type), and Goldberg
(a con[...]). [...] evaluative-measure scores [...]
specific concept. Thus, each subject [...] evaluative
score for each of the concepts on which he
expressed himself.

² The authors wish to express their thanks to the
National Educational Television and Radio Center
for the loan of the experimental film Roots of
Prejudice.


The third test used in the battery was the Ethno- 
centrism Scale (Adorno, Frenkel-Brunswik, Levinson, 
& Sanford, 1950). The Ethnocentrism Scale purports 
to measure the degree of general intolerance of racial 
and religious out-groups. A 10-item form was used. 

The fourth measure used, a personal data sheet, 
included another application of the Osgood Semantic 
Differential (Evaluative Dimension). In this instance, 
the good-bad scale was used to evaluate the attitudes 
toward concepts related to methods of course pres- 
entation of those students who had completed tele- 
courses at the university. Information was also ob- 
tained from the respondent concerning sex, major 
field in college, age, class, grade point average, and 
previous experience with telecourses.


Procedure 


Pretesting of both groups was carried out during
a regular class meeting. Following the administration
of the pretests and during the next regularly scheduled
meeting two days later, the experimental group
was shown the experimental television program,
Roots of Prejudice, by closed circuit television in
specially designed viewing rooms at the university.
Posttesting was effected at the same session for the
experimental group and at a comparable time for
the control group.

The experimental procedures were instituted in
such a way as to provide means for investigating
the two problems: the extent to which a single viewing
of the program would affect viewers' attitudes,
and the extent to which students' experience with
telecourses would affect their attitudes toward
television instruction.

With reference to the first problem, the procedure
was to compare pre- and posttest scores of the
experimental and control groups. In the case of the
other problem, the effect of the course-taking
experience with educational television on attitudes
toward that medium, the analysis of data involved
some of the responses on the personal data sheet and
general Osgood scale responses. This provided for
each subject a sum of four scores on the evaluative
dimension for the topics, classroom lecture and
television lecture, and provided information pertaining
to the student's experience (or lack of it) with
educational television, his success in courses taken by
television, and his evaluation of such courses (a
good-bad Osgood scale). To investigate the influence
of experience, the experienced and nonexperienced
groups were compared in terms of their Osgood
scores.


RESULTS 
Effectiveness of the Program 


The control and experimental groups were
compared by means of Wilson's distribution-free
test of analysis of variance hypotheses
(Wilson, 1956) in terms of their pretest
scores to determine whether or not they were
matched groups in terms of Osgood scale
responses with respect to the nine concepts. No
significant differences were found. Therefore, 
in order to determine the effectiveness of the 
film and to eliminate the possibility of “re- 
sponse set" of the pretest itself, comparisons 
of C and E on the posttests were made using 
the same statistical test. Table 1 reports these 
data. It indicates the concepts examined and 
the chi square value for the posttest differ- 
ences on each of three Osgood factors, evalua- 
tive, potency, and activity. As is apparent 
with one degree of freedom and a chi square 
value of 3.84 necessary for significance at 
even the 5% level of confidence, none of the 
differences even approaches significance. 
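
The criterion of 3.84 quoted above can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis: it relies on the fact that a chi square variate with one degree of freedom is the square of a standard normal variate, and the function name `chi_square_critical_1df` is a hypothetical helper introduced only for illustration.

```python
from statistics import NormalDist


def chi_square_critical_1df(alpha: float) -> float:
    """Critical chi square value for 1 df at significance level alpha.

    With 1 df, chi square is the square of a standard normal variate,
    so the critical value is the squared two-tailed z value.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z


critical = chi_square_critical_1df(0.05)
print(round(critical, 2))  # 3.84
```

Any posttest chi square in Table 1 smaller than this value would fail to reach the 5% level under these assumptions.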

Pre- and posttest responses in E on the
Evans Quasi-Role-Playing Device were compared
with respect to the number of "dislike"
and "like" responses made with respect to the
Supreme Court desegregation decision. On both
the pre- and posttests a 51% to 49% breakdown
of the proportion of dislike to like responses
was present, indicating, of course, in
a quantitative sense no apparent effect of
the film on expressed rationalizations for or
against the Supreme Court decision.

An analysis of the Evans test responses in
qualitative terms by virtue of an examination
of the content of the pre- and posttest [...]
given for accepting or rejecting the Supreme
Court segregation decision also indicated no
significant impact of the experimental film.
[...] television program [...] effect on
attitudes [...].

To explore more thoroughly the possibility
of some impact of the program as a function
of a theoretically related attitude, ethnocentrism,
another procedure was introduced. This
involved the comparison of the [...]
prejudiced (above the [...]) [...]
groups as measured [...] responses. For
[...] groups, medians [...]
for pretest and posttest scores on the Osgood.
If related attitudes operate [...]
in such situations, individuals with [...]
scores on the ethnocentrism [...]
tend to be affected differently by [...].
The pre- and posttest medians [...]
topics of the Osgood were compared by t tests.


TABLE 1
CHI SQUARE VALUES (1 df) OF COMPARISONS BETWEEN
POSTTEST RESPONSES ON OSGOOD SCALES OF C
(n = 33) AND E (n = 24)

Concept                   Evaluative    Potency       Activity
Psychologists             .16           [illegible]   .26
Racial prejudice          .21           .21           .06
Classroom lecture         .35           .00           .00
Negroes                   .32           .20           [illegible]
The name Blake            [illegible]   .22           [illegible]
Integration of races      [illegible]   [illegible]   [illegible]
Television instruction    [illegible]   .19           .01
White people              .02           .01           .00
The name Goldberg         .07           .34           .42

Note. Chi square equal to or greater than 3.84 is necessary
for significance at the 5% level of confidence.


No significant differences were found for the 
high prejudice group. For the low prejudice 
group, there was a shift in medians on both 
Blake and Goldberg on the evaluative factor 
toward a less favorable or valued position. 
The former difference was significant at the 
.01 level of confidence, the latter at the .02 
level. To rule out the possibility that two sig-
nificant t tests out of a group of 18 computed
could be due to chance, an examination of a 
table for the test of significance for a series 
of statistical tests (Sakoda, Cohen, & Beall, 
1954) was made. This revealed significance 
of these findings approaching the 1% level of 
confidence. 
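
The logic of checking a series of tests against chance can be illustrated with a simple binomial calculation. This is only a sketch under stated assumptions: the 18 tests are treated as independent, each is given a .01 chance of a spurious significant result, and the function name `prob_at_least` is hypothetical. It is not the Sakoda, Cohen, and Beall table itself.

```python
from math import comb


def prob_at_least(k: int, n_tests: int, alpha: float) -> float:
    """Probability of k or more chance 'significant' results among
    n_tests independent tests, each run at significance level alpha."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


# Two or more significant results out of 18 tests at the .01 level.
p = prob_at_least(2, 18, 0.01)
print(round(p, 4))  # about 0.014
```

Under these assumptions, two significant results in 18 tests would arise by chance only about 1.4% of the time, which is in line with the reported finding of significance approaching the 1% level.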

To determine the extent to which experi- 
ence with television as a medium of instruc- 
tion was related to students’ attitudes to- 
ward that medium, the total n of 160 was 
administered the personal data sheet and was 
divided into two groups, the television-experienced
group (n = 45), those who had enrolled
in and earned grades in one or more
telecourses, and the television-nonexperienced
group (n = 115), those who had not been enrolled
in telecourses. These two groups were
compared on the Osgood concepts, classroom
lecture and television lecture. The t tests of
the differences between medians for the two
groups were not significant on either of these
concepts.

A t test of the median values placed on the
concepts, classroom lecture, and television lec- 
ture, by the TV-experienced group yielded a 
difference significant at the .01 level of con- 
fidence. This difference favored the classroom 
lecture as a medium of instruction. A likely 
inference from this result, of course, is that a 
less favorable attitude toward television in- 
struction resulted from experience in taking 
telecourses. It still seemed possible, however, 
that factors other than experience, per se, 
might have contributed to this result. There- 
fore, a phi coefficient was computed between 
attitudes toward television lecture and class- 
room lecture involving the total n of 160 to 
yield some insight concerning the amount of 
variance accounted for by this relationship. 
The obtained coefficient of .29, although sig-
nificant at the .05 level of confidence, clearly
suggested that, indeed, a great deal of the
variance was unaccounted for. Therefore, a
phi coefficient between attitude toward tele-
vision lecture and grades earned in the tele-
course was computed for the TV-experienced
group. This computation yielded a value of
.36, significant at the .01 level. It appears,
therefore, that although negative attitudes to-
ward television as a medium of instruction
may partly result from experience per se with
such instruction, poor grades in the course
may not be ruled out as a possible factor
contributing to the formation of negative
attitudes.
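
The remark that a great deal of the variance was unaccounted for follows from squaring the coefficient. The sketch below assumes the two reported phi coefficients are read as .29 and .36 and uses the usual convention that the squared coefficient gives the proportion of shared variance; the function name `variance_accounted_for` is a hypothetical helper.

```python
def variance_accounted_for(phi: float) -> float:
    """Proportion of variance shared by two variables, taken as phi squared."""
    return phi**2


reported = (
    ("television lecture vs. classroom lecture attitudes", 0.29),
    ("television lecture attitude vs. telecourse grades", 0.36),
)
for label, phi in reported:
    shared = variance_accounted_for(phi)
    print(f"{label}: phi = {phi:.2f}, shared variance = {shared:.1%}")
```

On this reading, neither relationship accounts for more than about 13% of the variance, so most of the variation in attitudes remains unexplained.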


DISCUSSION


Certain tendencies in the present study
might be further examined. The fact that Osgood
[...] demonstrated that so-called [...]
measured by the E Scale, [...] even less likely
to stereotype [...] Blake and Goldberg and
re[...] communication messages which challenge
these strong attitudes, while individuals holding
less intense attitudes (in this case Osgood responses
which suggest only a slight tendency to stereotype)
are more vulnerable to such communication
messages.

With respect to the question of the attitude 
toward television as an instructional medium, 
there is evidence present in support of the 
proposition that "out of context" studies of
instructional television may lead to invalid 
conclusions. The demonstration in the pres- 
ent study of the influence of grades received 
in telecourses gives some support to this pos- 
sibility. The contention proposed here is that, 
if all related variables (e.g., attitude toward 
the instructor, the subject matter itself, gen- 
eral motivation arousal characteristics of the 
course) could be held constant, television as 
a teaching medium would take its place alongside
other educational media; that is, they
are all equally well received in proportion to 
the extent that effective learning principles 
are implemented in their use. 


SUMMARY AND CONCLUSIONS


The present study was designed to investi- 
gate two questions: 

1. To what extent would a single contro- 
versial educational television program change 
viewers' attitudes? 

2. How would previous experience with telecourses
affect students' attitudes toward television
instruction as contrasted with traditional
classroom instruction?

Introductory psychology students, compris- 
ing a control and an experimental group, were 
administered a group of psychological instru- 
ments consisting of the Ethnocentrism Scale, 
the Evans Quasi-Role-Playing Device, and 
nine concepts using the Osgood Semantic Dif- 
ferential (Evaluative Dimension). A personal 
data sheet elicited information concerning 
subjects' experience with and evaluation of
telecourses. Following pretesting, the experimental
group viewed by closed circuit television
the program, Roots of Prejudice. Posttesting
procedures involved the administration
of the Evans test and the Semantic
Differential to both groups.


Analyses of the data lead to the following 
conclusions: 


1. The educational television program,
Roots of Prejudice, succeeded, at least as
measured by immediate posttest attitudes, in
causing relatively nonethnocentric viewers to
engage in less intense name stereotyping, did
not alter attitudes of ethnocentric viewers,
and, in brief, fell short of the apparent goals
of the program.

2. Negative attitudes toward television in- 
struction may be less the result of experience 
in taking telecourses than of such factors as 
poor course performance. In other words, tele- 
vision as a medium of instruction may be- 
come an available “whipping post," because 
of its novelty, for latent hostile attitudes aris- 
ing from other factors in the college course 
situation. 


REFERENCES 
Adorno, T. W., Frenkel-Brunswik, Else, Levinson,
D. J., & Sanford, R. N. The authoritarian
personality. New York: Harper, 1950.

Asher, J. J., & Evans, R. I. An investigation of some
aspects of the social psychological impact of an
educational television program. J. appl. Psychol.,
1959, 43, 166-169.

Evans, R. I. Personal values as factors in anti-semitism.
J. abnorm. soc. Psychol., 1952, 47, 749-756.

Evans, R. I. An examination of students' attitudes
toward television as a medium of instruction in a
psychology course. J. appl. Psychol., 1956, 40, 32-34.

Evans, R. I. An analysis of some demographic and
psychological characteristics of an educational
television audience. In R. W. Crary (Ed.), The audience
for educational television. (Res. Rep. No. 573)
Ann Arbor: Educational Television and Radio
Center, 1957. Pp. 33-46.

Osgood, C. E. The measurement of meaning. Urbana:
Univer. Illinois Press, 1957.

Sakoda, J. M., Cohen, B. H., & Beall, G. Test of
significance for a series of statistical tests. Psychol.
Bull., 1954, 51, 172-175.

Wilson, K. V. A distribution-free test of analysis of
variance hypotheses. Psychol. Bull., 1956, 53, 96-101.


(Received February 15, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 1, 16-21 


SELF-DESCRIPTION AS A PREDICTOR OF SUCCESS 
OR FAILURE IN FOREIGN SERVICE 
CLERICAL JOBS 


REGIS H. WALTHER


Department of State 


Cronbach (1960), Bogard (1960), and Ghiselli
and Barthol (1953) have reported studies
showing a relationship between self-report
questionnaires and job success. The present 
study was designed to determine whether re- 
sponses given by Foreign Service clerical per- 
sonnel to questions about attitudes, interests, 
activities, and family background could be 
used to predict success or failure in various 
clerical jobs in the Foreign Service and thus 


improve the quality of performance and reduce
turnover.


METHOD 


Starting in the spring of 1954 at the recommendation
of the Civil Service Commission, the Department
of State gave all new Foreign Service clerical
employees the day after they entered on duty a
68-item multiple choice questionnaire including
questions relating to grades and interests in school, job
characteristics most liked or disliked, relationship
with parents, social activities, steadiness of employment,
hobbies or outside interests, etc. Sample questions are:


Do you entertain groups at home:
a. Frequently
b. Occasionally
c. Almost never

Have you:
a. Often been double crossed by people
b. Sometimes been double crossed by people
c. Been double crossed once or twice by people
d. Never been double crossed by people

Were your parents:
a. Always very strict with you
b. Usually very strict with you
c. Seldom very strict with you
d. Never very strict with you

Which of the following characteristics of a job is
least important to you:
a. Opportunity to ask questions and consult about
difficulties
b. Opportunity to understand just how one's superior
expects work to be done
c. Certainty one's work will be judged by fair
standards
d. Freedom in working out one's own method of
doing the work
e. Co-workers: congenial, competent, and adequate
in number.


The responses given to these questions have n 
used onlv for research purposes, The sees 
group (N = 1183) consisted of all clerical qt 
who had completed the questionnaire prior to s 
cember 1, 1956. For each of these employees it pe 
determined (a) if he still was employed as of x 
Eust 30, 1959; (b) if he had resigned or been "n 
rated, how long he had served before leaving ; aoa 
the reason given for his resignation or ance 
(poor performance, dissatisfaction, marriage, or ee 
reasons); (d) the level of his performance on i 
average, average, or below average). The Bee 
nation regarding level of performance was Lear a 
the average of the summary numerical ratings C 
tained in the employee's personnel file. : Bs 

The data was first processed by dividing the subjects into job categories and then by means of an informal item analysis determining which items seemed to predict success or failure for the various job categories. It was found that there were substantial differences among the employees assigned to the five job categories (secretaries, mail and records clerks, code clerks, pouch clerks, and accounting clerks). It was also found that the same items which seemed to predict tenure also appeared to predict quality of work.
The next step was to attempt to find the reasons why particular responses seemed to be related to success or failure in the different clerical jobs. A series of group meetings attended by individuals who had either performed or supervised one of the jobs being studied was held to get information about the satisfaction contained in the work and the qualities possessed by the individuals who do well as opposed to those who do poorly. A series of hypotheses was developed based both on pertinent statistical information and on the information developed from the group meetings and the judgments of experienced personnel officers. These hypotheses related to the behavioral requirements for success in the jobs being studied and the degree to which they were important for the different jobs. Questionnaire responses were then grouped together on the basis of four criteria: (a) some positive intercorrelation, not necessarily statistically significant, (b) logical relationship to the hypotheses, (c) ability of the item to distinguish between above average and below average performers
in the job category in which the hypothesized be- 


Foreign Service Clerical Jobs 




TABLE 1
QUESTIONNAIRE RESULTS BY JOB CATEGORY

                      Experimental Group      Cross-Validation Group
Job Category          N     Mean    SD        N     Mean    SD
I. Secretaries
A. Still Employed 
Above average 78 22 
8 .2 +3 18 29 
Average 92 —13 44 57 20 3.8 
Below average 14 —85.6 Me 4 33 
Doing other work 18 =14 3.5 —11.0 7.8 
l'otal 202 =$ 4.7 49 -i3 ET 
B. Resigned or Separated
Poor performance 31 —5.9 4.6 
" A 4 ay B 11 - 5 
Dissatisfaction 23 EE 57 H 2 43 
Other reasons 137 —1:6 47 js = 43 
Marriage 126 =$ 42 10 m 6.4 
‘Total (excluding marriage) 191 EM 49 30 a 38 
‘Total (including marriage) 317 -1.8 4.9 40 TE = 
: -27 8 
II. Code Clerks 
A. Still Employed 
Above average 28 E 3.8 19 P 
Average 52 EE m 13 K : 2.3 
Below average 7 =i 3.2 7 —2.5 2a 
Total 87 i 3.8 30 E 1 M 
^ 3.0 
B. Resigned or Separated 
Poor performance 24 —3.4 25 5 35 
Other 47 =l 34 12 " 
Total 71 —1.3 3.0 17 ae 
III. Mail and Records Clerks
A. Still Employed 
Above average 16 EI 24 = dis 
Average 24 —1.0 3.6 12 dá 30 
Below average 8 -3 84 —34 bi 
‘Total 48 -5 3.2 29 -—1.4 48 
B. Resigned or Separated 
Poor performance 8 = 4.0 4 —33 36 
Other 39 sid 4.2 7 13 33 
Marriage ne -23 3.8 
"Total 8s —2.6 4.0 11 —20 32 


havioral requirement appeared to be most important, (d) ability of the item to distinguish above average employees in the job category presumably most influenced by the characteristic from above average employees in the job categories least influenced by the characteristic. For example, it was hypothesized that a high degree of social isolation was a negative consideration for secretaries and that a high degree of sociability was a negative consideration for code clerks. Therefore items which logically related to degree of social isolation, which distinguished above average from below average secretaries, and which also distinguished between above average secretaries and code clerks were placed together. The judgments were made by the writer. From this process there emerged seven clusters of items, each cluster suggesting a particular behavioral style or capability.



The next step was to select 20 high performers and 20 low performers for each of the job categories. The high group consisted of employees who had stayed with the job for at least 3 years and consistently had received a superior rating. The low groups consisted of employees whose services clearly were not satisfactory. Weights were given to the items so as to create the maximum numerical spread between the high and low groups and between the high groups for different occupational categories. Weights were
also given to each of the seven clusters for each of 
the occupational categories, thus developing scoring 
keys for each of the seven clusters of items and for 
the five job categories. The weighted scoring keys 
were then applied to the experimental group and 
adjustments made in weights for the individual items 
and for the cluster in order to improve the effective- 
ness of the scoring keys. The results are contained in 
Table 1. 
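
The weighting step described above can be sketched in code. The article does not state the formula used to assign weights, so the rule below (weight each response by the difference between the proportions of the high and low groups choosing it) is only one common empirical-keying convention, assumed here for illustration. The item names, responses, and the `scale` parameter are all hypothetical.

```python
from collections import Counter


def response_weights(high_group, low_group, scale=10):
    """Weight each (item, response) by how much more often the high group chose it.

    Each group is a list of dicts mapping item id -> chosen response.
    The difference-in-proportions rule is an assumption for illustration;
    the article does not state its weighting formula.
    """
    def proportions(group):
        counts = Counter(
            (item, resp) for person in group for item, resp in person.items()
        )
        return {key: n / len(group) for key, n in counts.items()}

    p_high, p_low = proportions(high_group), proportions(low_group)
    keys = set(p_high) | set(p_low)
    return {k: round(scale * (p_high.get(k, 0.0) - p_low.get(k, 0.0))) for k in keys}


def score(person, weights):
    """Sum the weights of the responses one person gave."""
    return sum(weights.get((item, resp), 0) for item, resp in person.items())


# Hypothetical data: two questionnaire items, four employees per group.
high = [{"entertain": "a", "strict": "b"}, {"entertain": "a", "strict": "b"},
        {"entertain": "b", "strict": "b"}, {"entertain": "a", "strict": "c"}]
low = [{"entertain": "c", "strict": "a"}, {"entertain": "c", "strict": "d"},
       {"entertain": "b", "strict": "a"}, {"entertain": "c", "strict": "b"}]

weights = response_weights(high, low)
high_scores = [score(p, weights) for p in high]
low_scores = [score(p, weights) for p in low]
```

Scores computed on the same employees used to derive the weights will tend to look better than they would on new cases, which is why a key of this kind needs to be checked on a separate cross-validation group.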

The cross-validation group was composed of all 
Foreign Service clerical employees hired between De- 
cember 1, 1956 and March 30, 1957 for secretaries, 
and between December 1, 1956 and December 30, 
1957 for all other clerical employees (excluding those 
who had transferred to Civil Service), plus any cleri- 
cal employee hired after these dates whose services 
clearly had been unsatisfactory. The results of this 
part of the study are also contained in Table 1. 

For the most part the duties of the five jobs can 
be inferred from the titles. The secretary performs 
typical secretarial duties except that there is a more 
frequent change in bosses due to the fact that personnel are transferred periodically from post to post.
The mail and records clerk classifies and files incom- 
ing documents and searches for particular documents 
upon request. The code clerk follows a set procedure 
for enciphering and deciphering telegrams. He must 

have some mechanical aptitudes and be able to work 
alone without direct supervision. For security reasons 
he works in a confined area behind a locked door 


Regis H. 


Walther 


and the work does not involve much social inter- 
course during the workday. He either works on shifts 
or must be available for call during the off hours. 
The pouch clerk performs clerical and manual work related to receiving and dispatching all pouches transmitting mail between the Department of State and Foreign Service posts. The accounting clerk performs various record-keeping functions related to the disbursement of funds, and the obligation and liquidation of accounts at Foreign Service posts.
The criteria used to measure success were performance ratings and tenure on the job. However, only the performance criterion was used for developing the scoring keys. Because most personnel decisions are made in Washington, the Department of State considers it essential that the written record for the employee be adequate, and therefore a great deal of effort is expended to make the ratings as useful as possible. Each individual included in the study had been rated at least three times by a minimum of [illegible] raters, plus at least one rating by a Foreign Service inspector. During 1955, 124 Foreign Service employees were given a special rating with the raters assured that the rating would be used solely for research purposes. Three evaluations were obtained for each employee, one from the supervisor, one from the administrative officer, and one from a co-worker. The average of these ratings was found to correlate [illegible] with the ratings contained in the employee's personnel file.

Accurate records were available regarding the length of service of all the employees included in the study. Each resigning employee is required to send a letter giving his reasons, and in about half the cases exit interviews are conducted. Employees included in the experimental or cross-validation group who had been separated or had resigned were divided into four categories: Poor performance, Dissatisfaction, Marriage, and Other reasons. If a review of the em-


TABLE 2 
COMPARISON OF ABOVE AVERAGE EMPLOYEES
Experimental Group Cross-Validation Group 
Job Category N Mean SD t N Mean SD t 
1. Secretarial Key 
a. Secretaries 78 z3 4.3 x 18 2.0 3.8 30* 
b. Other clerical employees 84 —3.6 44 104 35 —2.8 4.6 i 
2. Code Clerk Key 
a. Code clerk: 5 5 
Pens Taks 30 m 38 ip B 5S 23) 46 
». Other clerical employees 132 =2.3 3.8 34 —-4 3.4 
3. Mail and Records Clerk Key 
a. Mail and Records clerk 16 zit 24 10 —.7 2.5 
b. Other clerical employees 146 —24 iat 18 43 aa 24] : 
* Significant at the .01 level or better. 




TABLE 3 
COMPARISON OF ABOVE AND BELOW AVERAGE EMPLOYEES ON INDIVIDUAL ELEMENTS 


Experimental Group 


Cross-Validation Group 


Above Avg. Below Avg. Above Avg. Below Avg. 
Job Category Mean SD Mean SD t Mean SD Mean SD t
A. Secretaries (N = 78) (N = 45) (N = 18) (N = 14)
Please Authority 16 17 -13 18 89 13 21 -16 22 34* 
` Social Isolation —26 30  —1 44 50* -31 18  —6 28 30" 
Social Leadership 20 20 10 18 2.8* 20 14 G LO 23“ 
Reaction to Aggression —1 17 8 13 3.1* 0 10 8 16 19 
Accept Standards 25 9 9 23 41* 28 7 A Bh BSF 
Accept Routines 17 34 4 32 3 23 3.6 9 40 10 
Use of Intellect 42 25 25 3.3 6 $0 18 20 20 14 
B. Code Clerks (N = 28) (N = 31) (N = 19) (N = 12) 
Please Authority —3 20 3 27 10 Q 23 0 20 
Shcial Taolition is “eT —14 3.2 zv 3 33 —24 2.9 20 
Social Leadership i 1.8 8 1.8 9,2* 3 1.4 1.6 1.8 23 
Reaction to Aggression E 1:5 zd 1.4 0 12 —2. 8 
Accept Standards 1.9 2.0 1.7 24 3 2.0 1.8 1.6 2.3 6 
Accept Routines 2.6 40 —3 28 3.6* 22 gr 7 24 — 16 
2.7 2.6 27 $3 3.6 22 3.4 1.4 


Use of Intellect 


* Significant at the .05 level.
** Significant at the .01 level or better.


ployee's file indicated a poor performance record, he was placed in the Poor performance category regardless of the reason for the separation. Included in the "Other reasons" category were individuals who probably would have been placed in the "Poor performance" or "Dissatisfaction" categories if more information had been available about their performance or real reason for leaving.

RESULTS 


The best results came from the secretaries. When the secretarial key was applied to the cross-validation group it was possible to predict performance with a product-moment correlation of .60 and turnover excluding marriage with a correlation of .49. The scores on the individual elements were surprisingly consistent between the experimental and cross-validation groups. The predictions for the code clerks produced a correlation of .38 for performance and [illegible] for turnover. For the mail and records clerks it was .28 for performance and [illegible] for turnover. The results for the pouch clerks were disappointing. It is believed that the reason for this is that pouch clerks are motivated primarily by nonjob factors.
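
For readers who want to see how a product-moment (Pearson) correlation between key scores and performance ratings is computed, a minimal sketch follows. The scores and ratings are hypothetical and are not taken from the study; the three-level rating coding (3 = above average, 2 = average, 1 = below average) is likewise an assumption.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical scoring-key totals and performance ratings for eight employees.
key_scores = [6, 3, 1, -2, -4, 0, 5, -6]
ratings = [3, 3, 2, 2, 1, 2, 3, 1]

r = pearson_r(key_scores, ratings)
```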


Table 2 gives the results of comparing the 
scores of above average secretaries, code 
clerks, and mail and records clerks. For secre- 
taries the differences are significant at better 
than the .01 level for both the experimental 
and cross-validation groups. Table 3 shows 
the comparison of above average and below 
average employees on individual elements. 
For the secretaries the difference between the 
two groups was significant at better than the .01 level for three of the elements.
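
The article does not say which form of the t test was used for these comparisons. As a sketch only, the function below computes the pooled-variance (Student) t statistic from group sizes, means, and standard deviations, which is the form such tables usually imply. The input figures are hypothetical summary statistics chosen for illustration.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic with pooled variance, from summary statistics.

    Returns (t, degrees_of_freedom). Assumes sample SDs computed with n - 1.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Hypothetical: 18 above average secretaries versus 35 other clerical employees.
t, df = pooled_t(18, 2.0, 3.8, 35, -2.8, 4.6)
```

With about 50 degrees of freedom, a two-tailed test at the .01 level requires a t of roughly 2.7, so a value near 3.8 would qualify.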


DISCUSSION


The seven clusters of items suggested dif- 
ferent behavioral styles and capabilities which 
have an important influence on work. Three 
of the clusters (Please Authority, Accept Rou- 
tines, and Accept Standards) seemed to relate 
to the way the individual likes to get his guide 
lines or instructions for doing his work— 
whether he responds primarily to the wishes 
of authority, the orders of authority, or the 
standards of authority. The next three clusters (degree of Social Isolation, Social Leadership, and Reaction to Aggression) seemed to relate to the way the individual deals with people. The last cluster (Use of Intellect) seemed to relate to the way the individual wants to use his mental capacities.

The following assumptions have been de- 
veloped regarding the specific meaning of 
these elements and the effect that a high or a 
low score has on work: 

Please Authority. This element relates to the degree to which the individual tries to please persons in authority. It involves more than obedience, as it includes personal satisfaction in getting approval from the boss. An individual who scores high on this factor seems to prefer to work closely with a supervisor, is sensitive to his wishes, and often knows what the boss wants or what he will do, without being told. A person who has a negative score probably is not able or willing to anticipate the wishes of the boss. He is likely to prefer work involving a minimum of personal supervision. Examples of positive items defining the element are: excellent grades in high school and not being charged with speeding during the last 5 years. Examples of negative items are: frequently arguing with parents during adolescence on political, religious, and other issues and considering the opportunity to understand just how one's superior expects work to be done less important than the other choices.


Acceptance of Routines. This element relates to the degree to which an individual accepts imposed routine and is content to follow it. A person with a high rating seems to prefer specific, concrete activities rather than ones involving variety and change, and to want to work under specific instructions. The instructions can be communicated either personally or impersonally. A negative score seems to indicate a desire for variety and change and an impatience with routine. Examples of positive items [illegible]

Acceptance of Standards. This element relates to the degree to which the individual [illegible] standards. A person with [illegible] to do things well and is able to work within set tolerance or standards. A person with a negative score may not do too well with work which requires unusual accuracy. Examples of positive items are: rarely or never disagreeing with parents during adolescence on political, religious, social, or other issues, and always meeting financial obligations promptly. Negative items are: getting fair or poor grades in school, and parents always or never very strict.

Social Isolation. This element relates to the degree to which the individual withdraws from people and avoids social relationships. A person with a high rating seems to prefer to work with things or data rather than people and not to object to working in isolation. A negative rating seems to indicate a desire to work with people and a dislike of isolation. Examples of positive items are: attends parties or social affairs several times a year or almost never and has only a few friends. Negative items are: entertains groups at home frequently and likes to play bridge.

Reaction to Aggression. This element relates to the degree to which the individual takes direct action when he thinks his rights are being violated. A person with a high rating responds to what he perceives as aggressive behavior on the part of other people with aggressive behavior of his own. A negative rating seems to indicate a tendency to placate, avoid, divert, or ignore aggressive behavior. Examples of positive items are: never has difficulty getting rid of salesmen and complains to waiter whenever complaint is justified. Negative items are: never been double crossed by people and never breaks into conversation.

Social Leadership. This element relates to the degree to which the individual takes the lead in social activities. A person with a high score seems to prefer activities involving the influencing of others through social type relationships. A negative score may indicate a preference for activities involving data or things rather than people. Examples of positive items are: when in school was a member of several clubs and organizations and has held positions as an officer in an organization during the last 5 years. Negative items are: almost never entertains groups at home and never has held a position in an organization.




Use of Intellect. This element relates to the degree to which an individual wants to use intellect when dealing with problems of work. The higher the score the more likely the individual will be interested in planning, using judgment, and solving problems. The lower the score the more desire for routine, repetitive, short cycle activities. It seems likely that each job has a range for this factor. Examples of positive items are: likes to play chess and stood in the top 10% of his college class. Negative items are: reads newspapers once a week or less and had no hobbies as a child.

With respect to the specific occupational 
categories it appears that a successful secre- 
tary likes to please the boss and is responsive 
to his requests, accepts standards, and is in- 
clined to take the lead in social relationships. 
On the other hand any significant degree of 
social isolation, resistance to or rejection of 
authority, or rejection of routines or bounda- 
ries are predictors of failure. She is likely to 
deal with aggression from others in other 
ways than fighting back. If she scores above 
a certain point on the Social Leadership or 
Use of Intellect scale she is likely to transfer to other work or to resign.

The successful code clerk seems to be less sociable, shows less social leadership, and is more willing to accept routines than the poor code clerk. There was general agreement among officers experienced with either supervising or assigning code clerks that they prefer to be supervised primarily by regulations rather than people.

It was the opinion of experienced officers that the mail and records clerk should be self-reliant, be able to make decisions, be able to negotiate successfully with other offices on procedural problems relating to records work, be able to work under pressure and budget her time, and work on a variety of tasks recognizing the relative importance of each. None of the elements identified through this study measured these characteristics, perhaps because the items in the


questionnaire were not sufficiently compre- 
hensive. There was a slight suggestion from 
the data that the mail and records clerks were 
more aggressive than the other clerks with re- 
spect to getting other people to do what they 
ought to do. It should be noted that the mail 
and records group has relatively low prestige 
and many individuals assigned to this work 
consider that they have been sent to “Siberia.” 
Therefore, it is likely that many individuals 
are influenced significantly by the social mean- 
ing of the work rather than the satisfaction 
they personally get from it. 

No statistical report is being made on ac- 
counting clerks because not enough subjects 
were included in the cross-validation group to 
permit the predictions to be tested adequately. 
The above average accounting clerks included 
within the experimental group scored high on 
the acceptance of standards and had moderate 
scores on the other elements. 


SUMMARY 


In this paper it has been reported that a 
questionnaire designed to do something else 
has proved to be reasonably successful at pre- 
dicting success or failure in various Foreign 
Service clerical occupations. Seven behavioral 
elements which relate to ways of responding 
to authority, ways for dealing with people, and 
level of use of intellect appeared to be sug- 
gested by the data. It seems likely that the 
effectiveness of the questionnaire can be im- 
proved by redesigning it specifically to meas- 
ure the behavioral style and capability ele- 
ments which have emerged from this study. 


REFERENCES 


Bogard, H. M. Union and management trainees: A comparative study of personality and occupational choice. J. appl. Psychol., 1960, 44, 56-63.

Cronbach, L. J. Essentials of psychological testing. (2nd ed.) New York: Harper, 1960.

Ghiselli, E. E., & Barthol, R. P. The validity of personality inventories in the selection of employees. J. appl. Psychol., 1953, 37, 18-20.


(Received February 19, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 1, 22-2 


ABILITY TO FOLLOW INSTRUCTIONS AS A PREDICTOR
OF SUCCESS IN RECRUIT TRAINING¹


FERDINAND STERN²
United States Naval Training Center, San Diego


AND LEONARD V. GORDON
United States Naval Personnel Research Field Activity, San Diego


Essentially, the purpose of recruit training in the Navy is to effect an orderly and progressive adjustment to military life and to provide, through information and practice, basic military indoctrination which will contribute to successful service (USN Bureau of Naval Personnel, 1953). In this indoctrination, the recruit obtains information on various topics through lectures, films, and demonstrations. In addition, he receives training in such military practices as barracks care, clothing care, personal cleanliness, and basic military drill.

The degree to which the recruit acquires information is assessed through weekly written quizzes and an objective examination given at the end of recruit training, and is highly predictable by certain aptitude measures in the Navy Classification Battery. His progress in acquiring efficiency in the fundamental military practices is subjectively evaluated on a continuing basis by his company commander, and is predicted with only very moderate success by Navy Classification Battery scores. Since early identification of individuals who will fail to meet requirements in military practices is desirable, some relevant predictive factors other than those presently measured by the Navy Classification Battery must be found.

Gordon and Alf ([illegible] a sample of school [illegible] follow directions [illegible] Classification Battery. The present study was undertaken to determine whether direction-following ability would also be predictive of the recruit's success in meeting minimal standards in basic military practices.

¹ Opinions expressed herein are those of the authors and do not necessarily reflect those of the Department of the Navy.

² Now at Central State Hospital, Waupun, Wisconsin.


PROCEDURE 


The sample in the present study consisted of 563 recruits, 511 of whom had graduated from recruit training, and 52 of whom had been discharged for reasons of inaptitude prior to completion of the training period. Individuals discharged for medical or disciplinary reasons were not included in the sample.

All recruits had been given the Navy Classification Battery during the morning of their third day in the service. This battery contains four tests: (a) the General Classification Test (GCT) consisting of analogy and sentence completion items; (b) the Arithmetic Test (ARI) containing computation and arithmetic reasoning items; (c) the Mechanical Test (MECH) consisting of tool and electrical knowledge, and mechanical comprehension items; and (d) the Clerical Test (CLER), a test of number-matching ability.

On the morning of the third day, prior to the administration of the Navy Classification Battery, a test of ability to follow directions was experimentally administered. The test, the Oral Directions Test (ODT) (Langmuir, 1952), provides a measure of the functional efficiency of the individual in understanding what he is told to do. It consists of a series of instructions presented orally by means of a recording. The examinee is to carry out each instruction on his answer sheet.

Two criteria were used in the present study. The first was the dichotomous criterion of inaptitude discharge-graduation from recruit training. Inaptitude discharges were given to recruits unable to achieve minimum standards in military practices such as marching, physical exercise, personal cleanliness, and barracks care, and whose conduct created various disciplinary problems. The second criterion was performance on the Recruit Final Achievement Test. This objective examination is administered at the end of recruit training to measure information acquired during the course of training.

Biserial correlation coefficients were obtained between all predictor variables and the discharge-graduation criterion. Product-moment correlations were obtained between all predictors and the Recruit Final Achievement Test scores for graduating recruits.
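
The article does not give the formula used, but the classical biserial coefficient for a continuous predictor and a dichotomized criterion can be sketched as below. It assumes the standard textbook form, r_bis = ((M1 - M0) / s) * (p * q / y), where p and q are the proportions in the two criterion groups, s is the standard deviation of all scores, and y is the height of the standard normal curve at the point dividing the area into p and q. The test scores and pass/fail flags are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial_r(scores, passed):
    """Classical biserial correlation between scores and a pass/fail criterion.

    Assumes the dichotomy reflects an underlying normally distributed trait.
    Requires at least one case in each criterion group.
    """
    group1 = [s for s, flag in zip(scores, passed) if flag]
    group0 = [s for s, flag in zip(scores, passed) if not flag]
    if not group1 or not group0:
        raise ValueError("both criterion groups must be non-empty")
    p = len(group1) / len(scores)
    q = 1 - p
    std_normal = NormalDist()
    ordinate = std_normal.pdf(std_normal.inv_cdf(p))
    return (mean(group1) - mean(group0)) / pstdev(scores) * (p * q / ordinate)


# Hypothetical test scores: eight graduates followed by two discharged recruits.
scores = [34, 31, 30, 33, 27, 35, 32, 26, 28, 25]
graduated = [True] * 8 + [False] * 2

r_bis = biserial_r(scores, graduated)
```

With very uneven splits and small samples this coefficient can exceed 1.0, which is one reason it is usually reported only for large samples such as the 563 recruits here.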


RESULTS 


Table 1 presents means, standard deviations, 
and validities of the Oral Directions Test and 
Basic Test Battery variables against the dis- 
charge-graduation criterion. All validity co- 
efficients are significantly different from zero 
well beyond the 1% level of confidence. The 
Oral Directions Test has the highest biserial 
correlation with the discharge-graduation cri- 
terion. 

Table 2 presents validities of all predictor 
variables against the Recruit Final Achieve- 
ment Test (RFAT) criterion for the graduat- 
ing recruits. All product-moment coefficients 
of correlations are significantly different from 
zero beyond the 1% level of confidence. For 
this group, and with this criterion, the Gen- 
eral Classification Test has the highest va- 
lidity. The validities of the Arithmetic Test, 
Mechanical Test, and Oral Direction Test are 
moderate and of about equal magnitude. 
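
The significance claim can be checked with the usual t transformation of a correlation, t = r * sqrt(N - 2) / sqrt(1 - r^2). The sketch below applies it to the RFAT validities as they read in the scanned Table 2, with decimal points restored (an assumption), and compares each t with a large-sample normal approximation to the two-tailed 1% critical value. The authors do not say which test they used.

```python
import math
from statistics import NormalDist


def correlation_t(r, n):
    """t statistic for testing a product-moment correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Two-tailed 1% critical value, normal approximation (reasonable for N = 511).
critical = NormalDist().inv_cdf(0.995)

# Validities against RFAT as read from the scanned Table 2 (assumed values).
rfat_validities = {"GCT": 0.58, "ARI": 0.38, "MECH": 0.36, "CLER": 0.18, "ODT": 0.36}

t_values = {name: correlation_t(r, 511) for name, r in rfat_validities.items()}
all_significant = all(t > critical for t in t_values.values())
```

Even the smallest coefficient (.18 for the Clerical Test) gives a t above 4, which is consistent with the statement that every coefficient is significant beyond the 1% level.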


DISCUSSION


The results of the present study suggest 
that a measure of the ability to follow in- 
structions may be fairly highly predictive of 
one type of success in recruit training but not 
of another. In the initial part of recruit train- 
ing, much emphasis is placed upon the de- 
velopment of simple habits of cleanliness, 
neatness, obedience, and the like. Individuals 
who are poor at understanding or following 
oral instructions are unlikely to grasp these 


TABLE 1
MEANS, STANDARD DEVIATIONS, AND BISERIAL CORRELATIONS BETWEEN PREDICTOR VARIABLES AND A DISCHARGE-GRADUATION CRITERION
(N = 563)

Variable   M      SD            r bis
GCT        51.4   8.9           .34
ARI        49.8   7.6           [illegible]
MECH       51.8   [illegible]   [illegible]
CLER       46.9   [illegible]   [illegible]
ODT        30.4   5.0           .52


TABLE 2
PRODUCT-MOMENT CORRELATIONS AMONG PREDICTOR VARIABLES AND THE RECRUIT FINAL ACHIEVEMENT TEST
(N = 511)

Variable   ARI    MECH   CLER   ODT    RFAT
GCT        .59    .41    .28    .54    .58
ARI               .25    .40    .39    .38
MECH                     .12    .36    .36
CLER                            .36    .18
ODT                                    .36


fundamentals with sufficient rapidity to en- 
able them to meet service standards. 

On the other hand, the ability to follow in- 
structions, as measured by the Oral Directions 
Test, is only very moderately predictive of 
the individual’s ability to reproduce, on a 
final achievement test, facts learned during 
recruit training. It contributes nothing, for 
example, to verbal intelligence as measured by 
the General Classification Test, the multiple 
correlation having the value of .58 which is 
identical with the validity of the General 
Classification Test alone. 
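
The multiple correlation quoted above can be reproduced from three coefficients in Table 2 with the standard two-predictor formula, R^2 = (r_y1^2 + r_y2^2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12^2). The sketch below assumes the table values read as GCT-RFAT = .58, ODT-RFAT = .36, and GCT-ODT = .54; the function name is illustrative.

```python
import math


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion with two correlated predictors."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return math.sqrt(r_squared)


# GCT and ODT predicting RFAT, using coefficients read from Table 2.
R = multiple_r(r_y1=0.58, r_y2=0.36, r_12=0.54)
```

Rounded to two places, R equals the .58 validity of the General Classification Test alone, so adding the Oral Directions Test leaves the prediction of RFAT scores essentially unchanged.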

The General Classification Test, while only 
moderately predictive of the inaptitude dis- 
charge criterion, is quite successful in predict- 
ing Recruit Final Achievement Test perform- 
ance. The verbal content of the General Clas- 
sification Test, the Recruit Final Achievement 
Test, and the training media undoubtedly account for the substantial relationship obtained.

A comparison of the results of the present 
study with those reported by Gordon and Alf 
is of interest. In the latter study, a median cor- 
relation of .35 was obtained between number 
of errors made in marking an interest inven- 
tory and Recruit Final Achievement Test 
scores; in the present study a correlation of 
.36 was found between Oral Direction Test 
scores and the same criterion. The measures 
of direction-following ability used in the two 
studies differed in their obtained relationship 
to verbal intelligence. The number of errors 
made on the interest inventory was unrelated 
to scores on the General Classification Test, while the Oral Directions Test scores had a significant relationship with this measure.
This lack of relationship in the Gordon and 
Alf study may have been due to restriction in 
range resulting from their use of a school- 
eligible sample. 

The results of the present study indicate 
that a test of ability to follow instructions ap- 
pears to have value in predicting a significant 
part of a particular criterion complex. Fur- 
ther exploration of the utility of such a test in 
the military services appears to be warranted. 
Whether such a test would have equivalent 


value in industrial settings would have to be 
situationally determined. 


REFERENCES 


Gordon, L. V., & Alf, E. F. An analysis of errors made in marking an interest inventory. J. gen. Psychol., in press.

Langmuir, C. R. Oral Directions Test. New York: Psychological Corporation, 1952.

United States Department of the Navy, Bureau of Naval Personnel. Curriculum for recruit training. Washington: USN BNP, 1953.


(Received February 23, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 1, 25-29 


A PSYCHOLOGICAL INVESTIGATION OF A GROUP OF 
DEMOGRAPHIC, PERSONALITY, AND BEHAVIORAL 
VARIABLES AS THEY RELATE TO VIEWING 
EDUCATIONAL TELEVISION¹


RICHARD I. EVANS


University of Houston 


Recent studies by Asher and Evans (1959) 
and Evans, Wieland, and Moore (1961) 
showed evidence of changes in attitudes as a 
result of viewing a specific educational tele- 
vision program under controlled conditions 
with selected samples. The impact of educa- 
tional television (ETV) in general, however, 
is more difficult to evaluate. As previously 
suggested by the writer (1957), size of audi- 
ence is not a sufficient criterion. Another pos- 
sible criterion is the degree to which programs 
seen are discussed with others. Such a hori- 
zontal interaction pattern would indicate im- 
pact beyond the group of individuals actually 
watching ETV.

Earlier studies of educational television audiences (Adams, 1956; Crary, 1957; Evans, 1957; Geiger & Sokol, 1957; Merrill, 1957; Schramm, 1957) have been principally concerned with analysis of demographic characteristics; only peripheral attention has been paid to personality and behavioral characteristics. The study previously cited (Evans, 1957), however, suggested that activity patterns of frequent viewers of educational television might differ significantly from those of nonviewers. Frequent viewers often appeared to have a self-improvement orientation manifested in such behavior as attending lectures and concerts, which was less noticeable in nonviewers. This observation suggests the possibility of underlying personality differences between the two groups.

¹ The present study was supported in part by a grant-in-aid from the National Educational Television and Radio Center, Ann Arbor, Michigan. Thanks are expressed to psychology graduate students Lawrence Simkins and Pat [illegible] for participation in planning and data analysis, H. [illegible] of the Biology Department for interviewer supervision and data analysis, psychology graduate student David Shaw for [illegible] analysis, and psychology student [illegible] Osmond for assistance in compilation of materials for the present report.

The present study was designed to explore 
some of the possible differences which were 
suggested by earlier research between frequent 
viewers and nonviewers of educational tele- 
vision. It was decided to examine more thor- 
oughly apparent differences in behavior of the 
two groups and to investigate the possibility 
of underlying personality differences. The re- 
search design also called for a further com- 
parison of the two groups with respect to cer- 
tain demographic characteristics. Specifically, 
the present study sought to answer the fol- 
lowing questions: 

1. Do frequent viewers and nonviewers of 
KUHT-TV (the University of Houston tele- 
vision station) differ with respect to occupa- 
tion, socioeconomic level, sex, educational 
level, race, or age? The previous study 
(Evans, 1957) indicated no difference with 
respect to these characteristics. 

2. How do the two groups compare with 
respect to certain measures of personality? 
(a) Are nonviewers of KUHT-TV more likely 
to be dogmatic than frequent viewers? Rok- 
each (1954) has defined a dogmatic indi- 
vidual as one who is inclined to be intoler- 
ant, rigid in his beliefs, opposed to new ideas, 
and closely allied to what he considers posi- 
tive authority. Because educational television 
may be considered a possible source of dis- 
turbing ideas, it was hypothesized that few 
dogmatic individuals would be frequent view- 
ers, since they would tend to avoid exposure 
to information which might threaten their be- 
lief systems. (b) Are nonviewers more likely 
to be authoritarian than frequent viewers? 
An authoritarian personality, as defined by 
Adorno, Frenkel-Brunswik, Levinson, and Sanford
(1950), tends to be anti-intraceptive,
ethnocentric, conventional, politically conservative,
and responsive to external rather
than internal authority. It was hypothesized 
that authoritarians would be less likely than 
equalitarians to watch educational television, 
since it might be thought of as endangering 
existing social practices. (c) Can the two 
groups be differentiated in terms of achieve- 
ment motivation? As defined by McClelland, 
Atkinson, Clarke, and Lowell (1953), need 
for achievement denotes an intrinsic, some- 
times compulsive, drive for accomplishment 
which dominates a large portion of the indi- 
vidual’s activity. Since viewing ETV might 
be one means of fulfilling a need for achieve- 
ment, it was hypothesized that frequent view- 
ers might have a higher need for achievement 
than nonviewers.

3. How do the two groups compare with respect to certain types of behavior? (a) Do frequent viewers and nonviewers differ in voting frequency? Results of the earlier study (Evans, 1957) pictured the frequent ETV viewer as civic-minded; he might, therefore, be expected to vote more regularly than a nonviewer. (b) Are there any significant differences between the two groups in leisure time activity preferences? If frequent viewers do have a more marked self-improvement orientation than nonviewers, as suggested above, [...] in their leisure time [...] frequent viewers have more [...] and discuss programs on [...]


SAMPLING 


The present sample was chosen from a
larger sample used in a previous study (Evans,
1957) [...] procedure developed by [...]
subdivided into frequent
viewers, former viewers, and nonviewers of
KUHT-TV. Since the original interviews were 
conducted in May 1956, and the present in- 
terviews between July 15 and August 15, 
1957, it was expected that some changes 
would have occurred in viewing frequency 
during the interval of a year. For the present 
study frequent viewers were defined as those 
who watched one ETV program per week, and 
nonviewers were defined as those who had not 
watched ETV within the last 6 months. Be- 
cause of the relatively small audience for 
ETV, these definitions were necessary in order
to obtain an adequate sample for analysis.
On the basis of these definitions, 100 frequent
viewers and 100 nonviewers were selected
from the original sample. Interviews
were completed with 137 members of this
group; the remainder could not be contacted.
Because some of the interviewees selected as
frequent viewers on the basis of the earlier
study could no longer be so classified, final
results were based on interviews with 42 frequent
viewers and 72 nonviewers. Although
the sample was necessarily limited, it was still
reasonably representative of frequent viewers 
and nonviewers of ETV in the greater Hous- 
ton area. 


METHODOLOGY


The questionnaire used in the present study included fixed-alternative, general open-end, and projective questions.

As personality measures, modified versions of the Dogmatism (D) Scale of Rokeach (1954), the authoritarian F Scale of Adorno et al. (1950), and the Need for and Value of Achievement Scales (nAch and vAch) of McClelland et al. (1953) were used. Shorter forms of these scales were developed in order to insure sufficiently high response motivation throughout the interview. Certain D- and F-Scale items were reworded to avoid fallacious responses that might result from acquiescent tendencies or other types of response set. This control measure was suggested by Christie, Havel, and Seidenberg
(1958), who demonstrated the reversibility of F- 
Scale items and analyzed the acquiescent factor origi- 
nally involved in this measure of authoritarianism. 

Several open-end projective questions reported by 
Adorno et al. (1950) were used to gain added insight
into authoritarianism. Responses to these questions
were coded in terms of the suggested categories.
Three judges estimated a cumulative index of
authoritarianism for each respondent, based on F- 
Scale scores and coded responses to the projective 
questions. This index was used in the statistical
analysis. Similarly, a cumulative index of dogmatism
for each individual was based on the several D-Scale 
scores. Achievement motivation level for each re- 
spondent was based on four open-end questions. Be- 
havioral information was sought by means of sev- 
eral fixed-alternative questions, plus general open- 
end questions regarding social interchange with 
friends on the subject of KUHT-TV programs. 

Six trained interviewers, including Baylor Univer- 
sity medical students and University of Houston psy- 
chology graduate students, interviewed respondents 
in 137 television homes during the period from July 
15 to August 15, 1957. Where possible, the open-end material was precoded for IBM analysis. A group of project assistants served as content analyzers for the open-end projective questions. The usual reliability checks were made among raters. A final [...] code was established by the writer in conference with the assistants. Information from the [...] was then punched on the IBM cards. Using the chi square method of testing the significance of between-group differences, responses to the questions were compared to determine whether significant differences existed between frequent viewers and nonviewers of KUHT-TV.
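
The chi square comparison described above can be illustrated with a minimal sketch. The counts below are hypothetical and chosen only so that the row totals match the 42 frequent viewers and 72 nonviewers of the final sample; the function name `chi_square` is likewise an illustrative assumption and not part of the original analysis, which was carried out on IBM punched cards.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and response category.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts (not from the study): rows are frequent viewers (n = 42)
# and nonviewers (n = 72); columns are three response categories.
counts = [[10, 20, 12],
          [30, 25, 17]]
stat, df = chi_square(counts)
print(f"chi square = {stat:.2f} with {df} degrees of freedom")
```

With two groups and three response categories the statistic has two degrees of freedom, which matches the degrees of freedom reported for the age and voting comparisons in the Results.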


RESULTS 


The first problem considered in the present 
study was the relationship of a group of 
demographic variables to ETV viewing. No 
significant differences were indicated with re- 
spect to occupation, income, sex, and race. 
The age category, with a chi square of 7.69 
and two degrees of freedom, yielded a con- 
fidence level of .05, indicating that frequent 
viewers tend to be significantly older than nonviewers. More specifically, based on the age group categories of 0-29 years, 30-50 years, and 50 plus years, the analysis suggested that educational television viewing increases in the 30-50 age group and reaches its peak in the 50 plus age group.

The second problem considered was the relationship between viewing and a group of personality variables, including Authoritarianism, Dogmatism, and Need for Achievement. Analysis of the data did not differentiate between the two groups on these variables.

The third problem considered, concerning behavioral variables, yielded the most interesting results. Statistically significant differences were found between frequent viewers and nonviewers with respect to voting behavior, leisure time activity preferences, and discussion of KUHT-TV programs.


With regard to voting, 75% of the sample
of frequent KUHT-TV viewers reported that 
they voted in every election, or nearly every 
election, whereas only 40% of the sample of 
nonviewers stated that they voted with such 
regularity. The chi square difference between 
the two groups of 14.35 with two degrees of 
freedom proved to be significant at the .001 
level of confidence, showing a highly signifi- 
cant tendency for frequent viewers to vote 
more frequently than nonviewers. 
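
The reported confidence levels can be checked directly from the chi square values given in the text. The sketch below relies on a standard property rather than on anything stated in the article: with two degrees of freedom the chi square distribution is exponential with mean 2, so its upper-tail probability is exp(-x/2). The function name is an illustrative assumption.

```python
import math


def chi2_pvalue_df2(stat):
    """Upper-tail probability of a chi-square statistic with 2 degrees of freedom.

    For df = 2 the distribution is exponential with mean 2,
    so P(X >= x) = exp(-x / 2).
    """
    return math.exp(-stat / 2.0)


# Chi square values as reported in the text, each with two degrees of freedom.
age_p = chi2_pvalue_df2(7.69)
voting_p = chi2_pvalue_df2(14.35)
print(f"age:    p = {age_p:.4f}")
print(f"voting: p = {voting_p:.5f}")
```

The age comparison comes out at roughly .02 and the voting comparison below .001, in agreement with the .05 and .001 levels reported.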

Another interesting clue to behavioral dif- 
ferences between the two groups was found in 
answers to the question, “How would you best 
like to spend a leisure evening?" Three cate- 
gories of answers were coded: (a) passive 
noninvolving activity, which included such 
spectator recreations as watching television, 
going to movies, and reading fiction; (b) passive
involving activity, which included such
information-value recreations as attending a 
lecture and reading nonfiction; (c) partici- 
pative activity, which included such social 
recreations as going to a party and playing 
cards. With respect to this group of variables, 
the chi square difference between frequent 
viewers and nonviewers was significant at the 
.05 level of confidence, thus conforming to
our criterion of statistical significance. The 
frequent viewer of educational television pre- 
ferred information-value recreations such as 
attending lectures and reading nonfiction, 
whereas the infrequent viewer preferred social 
recreations such as going to a party and play- 
ing cards. 

Portions of the present questionnaire were 
designed to help estimate the extent and im- 
pact of program discussions by viewers of 
educational television. The respondent was 
asked if his friends watched KUHT-TV, if he 
discussed the programs on KUHT-TV or top- 
ics related to these programs, and if such dis- 
cussions had been of value. Significantly more 
respondents who were themselves frequent 
viewers of KUHT-TV had friends who 
watched KUHT-TV and who mentioned pro- 
grams seen, or discussed topics related to 
these programs. The differences between frequent
viewers and nonviewers on these questions
were significant at the .01, .001, and .01
levels of confidence, respectively. 

These findings suggest that the frequent
viewer may discuss educational television with
other viewers more often than he promotes it 
to nonviewers. This is not to say, however, 
that there is no conversational contact be- 
tween frequent viewers and nonviewers. It is 
to be noted that more than a third of the non- 
viewers did have friends who watched and 
discussed KUHT-TV. It may be, moreover, 
that those who do not watch educational tele- 
vision regularly do not know whether or not 
their friends watch it, and do not recognize,
as such, topics that are related to ETV pro- 
grams. In support of this hypothesis, inspec- 
tion of the responses does show that nonview- 
ers used the “don’t know” response more fre- 
quently than frequent viewers in answer to 
the question, “Do any of your close friends 
watch KUHT-TV?” 

Other questions directed toward evaluating the impact of educational television were concerned with whether or not the respondent felt that the programs he had watched or discussed [...] any way. One question [...] actually watched [...]view, and a second [...] topics brought [...] Interestingly [...]s, almost as [...] given to the [...] A reason for [...]ion among viewers [...] educational television [...] community and [...]aluation.

[...]re less inclined than [...] they had benefited [...]ence. This may be [...] by the fact that [...] engaged in as much discussion [...] frequent viewers. Among nonviewers who did report that friends had mentioned some aspect of educational television, 60% felt that this information had been of some value. In the case of frequent viewers, 85% felt that information gained from ETV programs had helped them in some way.
Analysis of the responses concerning pos- 
sible personal benefits derived from actually 
watching or merely discussing KUHT-TV 
programs showed little difference between the 
impact of these two types of exposure to ETV 
program content. One reason for this was 
the tendency for respondents, especially fre- 
quent viewers, to discuss the informative 
value of particular programs they had actu- 
ally watched, even in answer to the question 
which sought to determine whether or not the respondent had been helped by discussions of educational television topics in general.

Since an important part of the total impact of educational television programs seems to involve post-program discussion, an analysis was made to determine how the various programs compared with respect to the extent that they were discussed by viewers. The regularly televised bimonthly meeting of the Houston School Board appeared to be the most discussed program. More than a fourth of those people who had heard discussions of KUHT-TV mentioned this program by name. Among frequent viewers of KUHT-TV, 40% stated that they had either watched the school board program the week of the interview or had discussed the program at some time.
The results presented here indicate that further study of ETV audiences with respect to behavioral characteristics would be of value. The possibility that frequent viewers of educational television may possess some distinguishing personality characteristics should also be explored further, despite the negative results to date. The fact that the frequent viewer does appear to differ significantly from the nonviewer with respect to behavioral characteristics strongly suggests some underlying personality differences, which more sensitive instruments might reveal.
Further research on the total impact in depth and breadth of educational television is also indicated. The development of adequate measures of impact is in itself an important problem. It may be that a viewer-panel type of instrument, such as that used in the previously cited study by the writer (1957), would be more useful for estimating impact than a unilateral questionnaire of the type used in the present study.

In conclusion, it should be pointed out that the results of the present study are encouraging
for the educational television movement.
Our findings indicate that frequent viewers 
discuss and act upon ideas and information 
gained from educational television programs. 
Such viewers could be instrumental in con- 
tributing to the success of this social move- 
ment. 

REFERENCES

Adams, J. S. An exploratory study of viewers and non-viewers of educational television. Chapel Hill: University of North Carolina, Chapel Hill Institute of Research and Social Science, 1956.

Adorno, T. W., Frenkel-Brunswik, Else, Levinson, D. J., & Sanford, R. N. The authoritarian personality. New York: Harper, 1950.

Asher, J. J., & Evans, R. I. An investigation of some aspects of the social psychological impact of an educational television program. J. appl. Psychol., 1959, 43, 166-169.

Christie, R., Havel, J., & Seidenberg, B. Is the F Scale reversible? J. abnorm. soc. Psychol., 1958, 56, 143-159.

Crary, R. W. An educational perspective on the educational television audience. In R. W. Crary (Ed.), The audience for educational television. (Res. Rep. No. 573) Ann Arbor: Educational Television and Radio Center, 1957. Pp. 2-19.

Evans, R. I. An analysis of some demographic and psychological characteristics of an educational television audience. In R. W. Crary (Ed.), The audience for educational television. (Res. Rep. No. 573) Ann Arbor: Educational Television and Radio Center, 1957. Pp. 33-46.

Evans, R. I., Wieland, Betty, & Moore, C. W. The effect of experience in telecourses on attitudes toward instruction by television and impact of a controversial television program. J. appl. Psychol., 1961, 45, 11-15.

Geiger, K., & Sokol, R. Educational television in Boston. In R. W. Crary (Ed.), The audience for educational television. (Res. Rep. No. 573) Ann Arbor: Educational Television and Radio Center, 1957. Pp. 86-115.

Hovcnurawp, D. Estimating dog population by a sampling procedure. Publ. Hlth Rep., 1956, 71, 296-297.

McClelland, D. C., Atkinson, J. W., Clarke, R., & Lowell, E. L. The achievement motive. New York: Appleton-Century-Crofts, 1953.

Merritt, I. R. Benchmark television-radio study. In R. W. Crary (Ed.), The audience for educational television. (Res. Rep. No. 573) Ann Arbor: Educational Television and Radio Center, 1957. Pp. 61-74.

Rokeach, M. The nature and meaning of dogmatism. Psychol. Rev., 1954, 61, 194-204.

Schramm, W. The audience for educational television in the San Francisco Bay Area. In R. W. Crary (Ed.), The audience for educational television. (Res. Rep. No. 573) Ann Arbor: Educational Television and Radio Center, 1957. Pp. 20-32.


(Received March 10, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 1, 30-34


MANAGERIAL ROLE INTERACTION: 
A STUDY OF THREE MANAGERIAL LEVELS 


HJALMAR ROSEN 


University of Illinois 


All managers in industrial concerns, with the exception perhaps of the very highest echelon, must integrate their roles as subordinates and superiors. The literature, in general, implies that the integration process is heavily weighted in terms of satisfying the demands of the immediate superior rather than [...] immediate subordinates (Dubin, [...]; Herzberg, Mausner, [...], 1957; McGregor, 1944). The reasons proposed for such upward orientation [...] around a high upward [...] 1948) or the auto[...] business organization (McGregor, 1944).

[...] i.e., the implementation [...] the direction of ac[...] is one of the primary functions of management, it is difficult to [...] unfortunately, in the present state of knowledge [...]


Subjects


This study was conducted in a moderately sized industrial plant located in a small midwestern urban center. The subjects included the plant manager, seven division managers directly under his supervision, and 37 section managers responsible to members of top management. All subjects had a supervisory function as part of their role obligation. As far as could be determined through interviews with the participants, the plant manager was primarily an integrator and policy maker. Division and section management had policy making functions, but within a relatively narrow sphere. Of primary importance to them were their responsibilities in implementing the decisions of their respective superiors.


METHOD


The instrument utilized in this study was a [...]ple ranking device. The stimuli to be ranked were made up of 16 occupational role prescriptions (listed in first column of Table 2 below), primarily with reference to the human relations aspects of supervision, garnered from the literature and from an earlier study (Rosen & Rosen, 1957). These stimuli were ranked by the subjects in order of their importance under the following restrictive conditions:

1. "Qualities men in your position should have," i.e., personal role evaluations

2. "Qualities you would most like to see in your (a) immediate superior(s) and (b) immediate subordinate(s)," i.e., demands upon others

3. "Qualities you think your immediate (a) superior(s) or (b) subordinate(s) would feel are most desirable for men in your position," i.e., prediction of demands of others

Division management ranked the stimuli under all conditions. Their superior, the plant manager, completed the personal role evaluations, demands upon subordinates, and prediction of subordinate demands. The middle management group completed the personal role evaluations, demands upon superior, and prediction of superior demands.

Rank data were combined for each managerial level into an average rank for each of the role variables.1 Interrelationships between variables were then analyzed by means of Spearman rho. The resultant correlations are presented in Table 1.

1 Within the management classes, the average rank scores were characterized by small standard deviations; consequently, the classes were treated as units due to their response homogeneity.
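
The rank correlation step can be sketched as follows. This is only an illustration under stated assumptions: the two rankings are hypothetical (not the study's data), the textbook formula used here holds only when neither ranking contains ties, and the function names are invented for the example. The critical values of .425 and .601 are the one-tailed .05 and .01 values for 16 ranked items given in the note to Table 1.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two untied rankings of the same items."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("rankings must have the same length (at least 2)")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def significance(rho, crit_05=0.425, crit_01=0.601):
    """Asterisk code using the one-tailed critical values from the Table 1 note."""
    if rho >= crit_01:
        return "**"
    if rho >= crit_05:
        return "*"
    return ""


# Hypothetical rankings of the 16 role prescriptions by two managerial levels.
level_a = list(range(1, 17))
level_b = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
rho = spearman_rho(level_a, level_b)
print(f"rho = {rho:+.2f}{significance(rho)}")
```

Average ranks such as those described above may contain ties, in which case a tie-corrected form of rho would be needed; the sketch leaves that case out.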


TABLE 1

SPEARMAN RHO CORRELATIONS BETWEEN ROLE VARIABLES
FOR THREE MANAGERIAL LEVELS

I. Prediction of Demands vs. Personal Role Evaluations
   Division Management
      Prediction of Superior vs. Personal Role Evaluations       +.92**
      Prediction of Subordinate vs. Personal Role Evaluations    +.45*
   Plant Manager
      Prediction of Subordinate vs. Personal Role Evaluations    [illegible]
   Section Management
      Prediction of Superior vs. Personal Role Evaluations       +.86**

II. Accuracy of Predictions
   Division Management
      Prediction of Superior vs. Superior Demand                 +.33
      Prediction of Subordinate vs. Subordinate Demand           +.52*
   Plant Manager
      Prediction of Subordinate vs. Subordinate Demand           +.36
   Section Management
      Prediction of Superior vs. Superior Demand                 +.75**

III. Adequacy of Personal Role Evaluations
   Division Management
      Personal Role Evaluations vs. Superior Demand              +.41
      Personal Role Evaluations vs. Subordinate Demand           [illegible]
   Plant Manager
      Personal Role Evaluations vs. Subordinate Demand           [illegible]
   Section Management
      Personal Role Evaluations vs. Superior Demand              +.78**

IV. Personal Role Evaluation and Demands upon Superior(s) and Subordinates
   Division Management
      Personal Role Evaluations vs. Demand upon Superior         [illegible]
      Personal Role Evaluations vs. Demand upon Subordinate      +.81**
   Plant Manager
      Personal Role Evaluations vs. Demand upon Subordinate      +.94**
   Section Management
      Personal Role Evaluations vs. Demand upon Superior         [illegible]

V. Relationships among Role Demands of Three Managerial Levels
   Division vs. Section Management
      Demands upon Superior                                      [illegible]
   Division Management vs. Plant Manager
      Demands upon Subordinate                                   [illegible]

VI. Relationships among Personal Role Evaluations of Three Levels
      Division Management vs. Plant Manager                      [illegible]
      Division Management vs. Section Management                 [illegible]
      Section Management vs. Plant Manager                       [illegible]

Note. Significance of Spearman rho's using one-tailed test (Siegel, 1956) are as follows:
* At .05 level: rho = .425.
** At .01 level: rho = .601.


RESULTS AND DISCUSSION 


Prediction of Demands and Personal Role
Evaluations


From Section I, Table 1, it is clear that per- 
sonal role evaluations of division management 
were significantly related to their prediction 
of both superior and subordinate demands. It 
should be noted, however, that their personal 
role evaluations were more highly related to 
their predictions of superior than subordinate 
demands. Corollary information supported 
these trends. The plant manager’s personal 
role evaluation was moderately and significantly
related to his predictions of subordinate
demands. Section management's personal
role evaluations were significantly related to 
their prediction of superior demands. 

In general, these results suggest that there is a tendency to perceive that personal role evaluations are highly comparable to superior demands, whether or not this position is valid. This may be taken as partial evidence for the upward orientation of management personnel proposed in the literature. Of equal importance, however, are the results that suggest that an upward orientation does not preclude some recognition of subordinate demands, i.e., upward orientation is a matter of degree and does not imply a rejection of subordinates by management personnel.


Accuracy of Predictions: Prediction of Demands versus Actual Demands


[...] predict the demands of the other to a significant degree.


Adequacy of Personal Role Evaluations


Section III, Table 1, indicates that the personal role evaluations of the divisional management level were not significantly related to the actual demands of their superior, but were highly related to the actual demands of their subordinates. Again, corollary data do support these trends. Section management's personal role evaluations were comparable to the actual demands of their superiors, whereas the plant manager's personal role evaluations were not significantly related to the actual demands of his subordinates. Stated somewhat differently, division and section management levels' personal role evaluations were highly related to each other's demands, whereas those of the plant manager and division management were not.

This suggests, in the light of the preceding data, that fulfilling the role demands of those in interlocking managerial levels may be largely a function of the accuracy of predictions rather than the desire to meet the demands of one level more than those of another. Moreover, it tends to confirm the hypothesis that there may be barriers to effective interaction between the plant manager and division management.


Personal Role Evaluation and Demands upon Superior(s) and Subordinates


It is clear from Section IV, Table 1, that the demands management makes upon their superior(s) and subordinates are largely a function of how they evaluate their personal roles. This suggests that managers do not consider that supervisory style is a function of the position of the manager within the hierarchy, but rather that "managing" calls for a common pattern of attributes.


Relationships among Role Demands of the Three Managerial Levels


Section V, Table 1, indicates that division and section management have much in common re the demands they make upon their immediate superior(s), whereas the plant manager's and the division management's demands upon those in immediate subordinate positions are not significantly related. Given adequate communication of demands, this would suggest that within this organization a common supervisory style would evolve from the division management level down, but that the advancing manager having reached the level of divisional management would be at odds with his superior.


TABLE 2

RELATIVE IMPORTANCE OF SIXTEEN MANAGERIAL ROLE PRESCRIPTIONS IN TERMS OF PERSONAL
ROLE EVALUATION FOR GENERAL, TOP, AND MIDDLE MANAGEMENT GROUPS

Managerial Role Prescription                                  Top Mgt.      Gen. Mgt.   Mid. Mgt.

 1. Knows job thoroughly                                          3            11       [illegible]a
 2. Plans in terms of the future                                  1             2            5
 3. Takes pains to make himself understood                        7             6            7
 4. Gives credit where credit is due                              9            12            8
 5. Judges subordinates in terms of merit only                   13            14           15
 6. Maintains discipline fairly                                  11            15           13
 7. Liberal but consistent in enforcing company rules            16            10           16
 8. Takes personal interest in subordinates                      14             4           11a
 9. Takes responsibility upon himself                             4             9            3
10. Fights for what he believes in                               12            16           10
11. Cooperative regarding suggestions made to him                10            13           12
12. Keeps promises to subordinates to the best of his ability     8             8            6
13. Keeps subordinates informed of what is going on          [illegible]        5            9
14. Explains the reasons behind each of his orders               15             1       [illegible]a
15. Investigates each problem thoroughly before making
    a decision                                                    2             3            4
16. Leads rather than drives his subordinates                     5             1            2

a Areas of major discrepancy in ratings of general management versus subordinates.


Relationships among Personal Role Evaluations of the Three Levels


Section VI, Table 1, indicates that the three managerial levels studied tended to regard their respective occupational roles in a significantly comparable manner. It should be noted, however, that there was a considerably higher degree of commonality between section and division managements' personal role evaluations than between either level and the plant manager.

To the extent that personal role evaluations provided the bases for both prediction of demands of others and demands made upon those in other levels, the relatively greater comparability between section and division managements than between division management and the plant manager is understandable. Given a poor communication of role demands between management levels, the effect would be more pronounced between the division management and the plant manager with relatively unrelated personal role evaluations than between section and division managements having relatively highly related personal role evaluations.

In part, the discrepancies between the plant 
manager and the managers in the two lower 
echelons with regard to personal role evaluations
may be due to differential role demands
associated with their positions in the hier- 
archy. Inspecting the raw ranks in Table 2 
one will note that three prescriptions ac- 
counted for most of the discrepancy. The 
plant manager put considerably less stress on 
“job knowledge” (see Item 1) and considerably
more stress on “taking personal interest
in subordinates" (see Item 8) and “explain- 
ing reasons behind orders given to subordi- 
nates” (see Item 14). It is possible that the 
plant manager in the role of generalist and 
integrator may have been somewhat depreciatory
of such a simple-minded prescription as
“job knowledge” when, in effect, his major
problem is that of utilizing information of
others and arriving at solutions to problems. 
With diversified issues to be settled, the only 
commonality is decision making or problem 
solving—difficult to classify as “job knowl- 
edge." Lower echelon management, having 
less of an integrative function and more of a 
technical function as a consequence may have 
assessed “job knowledge" as being relatively 
more important. 

The two other variables that the plant man- 
ager upgraded in importance relative to his 
subordinates are almost textbook examples of
"good human relations" approaches. In this 
case, the plant manager seemed to be more 
in the avant-garde of the human relations 
movement than the lower echelons of man- 
agement. The plant manager having reached 
the peak organizationally, and in Katona's 
(1951) terms, perhaps having identified himself
with the organization to a greater degree
than did his subordinates, may have put more 


weight on variables leading to organizational 
welfare. 


SUMMARY 


[...] accuracy of prediction of demands. In cases
where predictions paralleled the actual demands,
the impact was apparent; where predictions
were erroneous, the impact upon personal
evaluations was negligible. The tendency
to expect of those in other levels what
one demands of self again tends to provide a 
necessary but not sufficient condition for de- 
veloping a common set of values re super- 
vision among various levels of management. 
Finally, it was suggested that in the absence 
of adequate role demand communications, 
differential demands due to position in the 
management hierarchy may have in part ac- 
counted for the relatively lower commonalities 
in supervisory values between the plant man- 
ager and those in subordinate positions.


REFERENCES

Dubin, R. Human relations in administration. New York: Prentice-Hall, 1951.

Henry, W. E. The business executive: A study in the psychodynamics of a social role. Amer. J. Sociol., 1949, 54, 286-291.

Herzberg, F., Mausner, B., Peterson, R. O., & Capwell, Dora F. Job attitudes: Review of research and opinion. Pittsburgh, Penn.: Psychological Service of Pittsburgh, 1957.

Katona, G. Psychological analysis of economic behavior. New York: McGraw-Hill, 1951.

McGregor, D. Conditions of effective leadership in an industrial organization. J. consult. Psychol., 1944, 8, 55-63.

Rosen, H., & Rosen, R. A. H. The union business agent's perspective of his job. J. personnel Admin. industr. Relat., 1957, 3, 49-58.

Siegel, S. Nonparametric statistics. New York: McGraw-Hill, 1956.


(Received March 18, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 1, 35-40 


ATTITUDES TOWARDS NEWSCASTERS AS MEASURED 
BY THE SEMANTIC DIFFERENTIAL: 


A DESCRIPTIVE CASE 


PAUL M. KJELDERGAARD 1


Harvard University 


The semantic differential is a combination 
of word association and scaling techniques 
developed by Charles Osgood of the Univer- 
sity of Illinois in conjunction with his work 
on verbal behavior (Osgood, 1952). Concepts 
are rated on seven-point scales bounded by 
polar adjectives, e.g., Good-Bad, Kind-Cruel, 
etc. Adverb modifiers qualify each step on the 
scale; the greater the intensity of the asso- 
ciation, the more extreme the displacement 
towards one or the other polar term. Each 
concept is rated on several scales so that the 
semantic profiles of the concept can be 
plotted. By computing Ds (the generalized 
distance function in n-dimensional space) 
the geometrical relationships between con- 
cepts or “meaning” may be found (Osgood & 
Suci, 1952).

In the first use of the technique in its
present form, Stagner and Osgood (1946)
measured the changes in attitude toward certain
stereotypes as this country became involved
in World War II. Successive groups of
subjects rated such concepts as Pacifist, Russian,
and Dictator at various times between
April 1940 and March 1942. The instrument
proved sensitive to the predicted changes in
the "meaning" of the social stereotypes. In
addition it was found that certain of the
scales, e.g., Fair-Unfair and High-Low,
correlated very highly with each other,
yet were relatively independent of other
scales, e.g., Strong-Weak and Happy-Sad. This
latter finding led to the use of factor analysis
in conjunction with the semantic differential.

The potentiality of the instrument in opinion
research was shown in a study by Osgood,
Suci, and Tannenbaum (1957) in which the
semantic differential was used in conjunction
with a public opinion poll conducted in the

1 The author wishes to express his appreciation to
James J. Jenkins for his assistance in the preparation
of this manuscript.


Champaign-Urbana, Illinois area. A sample of 
150 subjects representative of the voting 
public of this area was tested four times dur- 
ing the summer and fall of 1952. These sub- 
jects rated 20 concepts, 10 of persons, e.g., 
Truman, Taft, and Stalin, and 10 of policies, 
e.g. use of the atomic bomb, our foreign 
policy, the New Deal. By matching the differ- 
ential profiles of the “undecided” voters with 
profiles of Stevenson or Eisenhower support- 
ers, the experimenters were able to predict 
how 18 out of 19 “undecided” actually voted 
at the polls. 
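
The matching step described above can be sketched as a nearest-profile rule. This is only an illustration under stated assumptions: the source does not give the exact matching procedure, the ratings are invented, and `profile_distance` and `nearest_group` are hypothetical helper names.

```python
import math

def profile_distance(a, b):
    """Distance between two equal-length profiles of mean scale ratings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_group(voter, group_profiles):
    """Label of the reference profile that lies closest to the voter's profile."""
    return min(group_profiles,
               key=lambda label: profile_distance(voter, group_profiles[label]))

# Hypothetical mean ratings on four 7-point scales (not data from the study)
groups = {
    "Stevenson": [6.0, 5.5, 2.0, 3.0],
    "Eisenhower": [3.0, 2.5, 6.0, 5.5],
}
undecided = [5.0, 5.0, 3.0, 3.5]

predicted = nearest_group(undecided, groups)
print(predicted)
```

A prediction of this kind would then be checked against how the respondent actually voted.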

Applications have also been made in the 
field of advertising research where studies 
have been conducted on the effect of color on 
advertised products, the effect of titles on the 
significance of pictures, and the influence of 
slogans on their advertised products (Osgood 
et al., 1957). Mindak (1956) used the differ- 
ential to compare the effectiveness of certain 
advertising appeals, e.g., the scientific appeal, 
the romantic appeal, etc., used in radio com- 
mercials for a new cosmetic product. 

This brief review only suggests the versatil- 
ity and the diversity of application of the 
instrument. Many studies of theoretical im- 
portance to psychology in such fields as verbal 
behavior, personality, clinical, and the dy- 
namics of social attitudes (Jenkins, Russell, 
& Suci, 1958; Osgood & Luria, 1954; Osgood 
et al., 1957) have also employed the semantic 
differential. Other areas of application men- 
tioned by Osgood et al. (1957) are studies of 
symbolism, aesthetic judgment, and interpre- 
tation of sonar signals. 

This article deals with a novel application 
of the instrument as a selection and evalua- 
tion device in casting a new television news 
program. Of equal importance is the fact that 
this study provides a measure of the stability 
of attitude changes which were induced by 
the viewing of a new program. Thus, the research
provided additional evidence concerning
the reliability and validity of the semantic
differential in applied research. 


PROBLEM 


Station Y-TV planned to reschedule its 
late evening news broadcast in an attempt to 
win back some of the news audience it had 
lost to station Z-TV. This was to be a com- 
pletely new program with new personalities, a 
new stage setting, and the latest television 
electronic “gimmicks.” In order to get an 
audience reaction, an outside research firm 
was called upon to pretest the new program 
with an audience via a closed-circuit TV in 
the station's studios. 

The station was in 
audience's reaction to 
and the TV personal 


grams. There was some 
he would project the m 


t ratings would be 
ho appeared on the 
did not appear 
popular and most 


in the area and in a loose sense 




a criterion or target against which to compare other 
newsmen. Furthermore, if we assume the ratings of 
individuals are independent, then from a theoretical
standpoint there should be no change from the pre- 
show to the postshow ratings of Individual X. Thus, 
the ratings on this newscaster provide us with an 
estimate of the reliability of the instrument. 

The 20 subjects who were called in to view the
new program were drawn from the consumers panel
of a local market research organization. The group
tended to be nonrepresentative of the population of
this area in being more highly educated and in a
higher income bracket than a typical listening audience
would be. By and large the sample was made
up of couples rather than randomly selected indi- 
viduals. The group was given a token reward for 
having participated in the preview. 


RESULTS AND DISCUSSION 


Figure 1 represents the semantic profiles 
for the three individuals (X—the newscaster 
on competing Station Z, A—the new news- 
caster for Station Y, and B—the editorialist 
and former newscaster for Station Y) over 
the three rating periods. Side-to-side comparisons
indicate the relative merits of these
individuals as viewed by the audience at each
time period. Up-and-down comparisons show
the changes that take place in an individual's
profile over time.

The preshow ratings from Figure 1 demonstrate
a number of things. First, it is clearly
indicated that Newscaster X enjoyed tremendous
popularity and is held in high esteem by
this group. His mean ratings on 8 of the 12
traits are 6.25 or higher, and on 3 of the 4
remaining scales, extremely high ratings could
be interpreted as probably being undesirable.
Individuals A (Station Y's newscaster) and
B (the editorialist) have mean semantic profiles
on the preshow ratings which displace
only slightly away from the neutral position
on most scales. There is little difference between
their profiles except that A is viewed as
much more Warm, Pleasant, and Interesting
than B.


* A three page table giving the means and standard
deviations on which Figure 1 is based has been
deposited with the American Documentation Institute.
Order Document No. 6550 from ADI Auxiliary Publications
Project, Photoduplication Service, Library
of Congress, Washington 25, D. C., remitting in advance
$1.25 for microfilm or $1.25 for photocopies.
Make checks payable to: Chief, Photoduplication
Service, Library of Congress.




[Figure 1 (profile plot not reproduced): mean ratings on a 7-point scale for
Individual A, Individual B, and Individual X, shown in three panels (Pre Show
Ratings, Post Show Ratings, Follow Up Ratings) on 12 scales: Active-Passive,
Composed-Nervous, Warm-Cold, Aggressive-Withdrawn, Fast-Slow,
Friendly-Unfriendly, Important-Unimportant, Sincere-Insincere,
Pleasant-Unpleasant, Interesting-Uninteresting, Mature-Immature,
Informed-Uninformed.]

FIG. 1. Mean semantic differential ratings on three television personalities
rated at three time periods: before an experimental program, after the
program, and 15 weeks later.


Turning to the postshow ratings, one finds
that marked changes have taken place as a
result of viewing the program. Only for Individual
X have the verbal stereotypes remained
relatively unchanged; he suffers a
slight regression towards the neutral position
on most traits (X̄ = -.12). The close correspondence
between his two profiles reflects
the high reliability of the instrument, a finding
consistent with those of other investigators
(Osgood, 1952; Jenkins et al., 1958).
Individual A's mean rating increased on
each of the 12 traits. Although he still failed
to exceed Individual X on any trait, the differences
between the ratings of the two individuals
on all traits were reduced. The
largest remaining difference (Maturity) had
been reduced from 1.81 to .60 as a result of
the program. His profile mirrored that of X,
an experienced newscaster, after only one
news program.

The verbal stereotypes associated with In- 
dividual B made changes in both directions, 
thus dispelling the notion that there was a 
general increase in mean ratings due to the 
recency of having seen the program. Further- 
more, the magnitudes of the negative changes 
for B (the editorialist) were relatively great, 
whereas the changes in a positive direction 
were small and insignificant. 

The follow-up ratings were obtained after 
a 15-week interval had elapsed. During 12 of 
these weeks Individual A did the newscasting 
for the new program, and Individual B no
longer appeared on this program, although he 
did participate in other programs for Station 
Y. Newscaster X was on his regular program 
on Station Z throughout the 15-week period. 

A supplementary questionnaire showed that
this group watched news on the new program
about twice as frequently as they had on the
old Y-TV show. How much of this increase
was due to ego-involvement with this new
program as opposed to other possible variables
such as a more favorable time, a better
liked cast, etc.,


should be noted tha 


njoyed a general increase in 
follow-up mean ratings over his postshow 
ratings, gaining on all but one trait —Matu- 
rity. The mean increases wer 
and the general 
two previous r. 




themselves as possible explanations of this 
phenomenon, the most credible of which is 
that the group was reinforced for their nega- 
tive attitude toward B—he no longer ap- 
peared on the program—thus intensifying 
their negative feelings. Many variables were 
operative over the elapsed time, however, and 
any explanation must remain conjecture. 

The follow-up ratings of Individual A
showed slight mean increases on 8 of the 12
traits over his postshow ratings. Whereas on
the earlier ratings, A had mean ratings consistently
below X, he now exceeded X on one
trait (Informed) and tied him on several
others. His profile was now generally closer to
X's, with the gap being wider on only two
scales, Fast and Aggressive.

Unfortunately, it is impossible to perform
tests of significance on the changes which
took place, for all measures are intercorrelated.
It is possible to express numerically the
differences between any two profiles, although
no satisfactory statistical tests have been developed
to test these differences. The measure
referred to is D (the generalized distance
function) which is defined as $D = \sqrt{\sum d_{ij}^{2}}$,
where $d_{ij}$ is the difference in means for the $i$th
polar pair and the $j$th individual rated. This
measure takes into account the differences between
profiles as well as the correlation between
them.
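
As a hedged illustration of the computation, the sketch below evaluates D for two profiles of mean ratings on 12 scales. The ratings are invented for the example and are not the study's data; `generalized_distance` is a hypothetical helper name.

```python
import math

def generalized_distance(profile_a, profile_b):
    """D: square root of the summed squared differences between paired scale means."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be rated on the same scales")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))

# Hypothetical mean ratings on 12 seven-point scales at two time periods
pre = [6.5, 6.3, 6.0, 5.0, 5.2, 6.4, 6.3, 6.5, 6.4, 6.3, 6.6, 6.7]
post = [6.3, 6.1, 5.8, 5.0, 5.2, 6.2, 6.1, 6.3, 6.2, 6.1, 6.4, 6.5]

d_pre_post = generalized_distance(pre, post)
print(round(d_pre_post, 2))
```

Identical profiles give D = 0, and D grows with both the number of scales that differ and the size of each difference.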

Table 1 presents 27 of the possible 36 profile
comparisons. Ds for individuals in one
time period compared to other individuals in
other time periods were omitted because they
lacked meaningfulness.

Although there are no statistical tests to
evaluate the magnitude of Ds, a crude estimate
of the amount of random error associated
with this technique in this situation
can be obtained from the D value for Individual
X for the comparison of preshow and
postshow profiles. If the instrument had perfect
reliability and judgments were independent,
i.e., changes in ratings of one individual
did not affect the ratings of other individuals,
the D value for this comparison would be
zero. The value obtained, .65, the square root
of the summed squared differences over 12 pairs, can
be interpreted as being very small; therefore
it appears that the instrument has relatively




TABLE 1
PROFILE Ds COMPUTED FOR THE THREE INDIVIDUALS ON EACH OF THREE TIME PERIODS

                     Pretest              Posttest             Follow-up
                    X      A   B(a)      X      A   B(a)      X      A   B(a)
Pretest    X     0.00
           A     3.85   0.00
           B     5.64   3.03   0.00
Posttest   X     0.65                 0.00
           A            2.28          1.21   0.00
           B                   1.66   5.63   4.44   0.00
Follow-up  X     0.95                 1.25                 0.00
           A            3.24                 1.05          1.89   0.00
           B                   3.08                 3.18   7.80   7.42   0.00

(a) The Ds for B are based on only 11 scales and are thus somewhat smaller comparatively.


high reliability and the ratings appear to be
relatively independent.

Using this value as a reference point, it is
apparent that the majority of the Ds are of a
sufficient magnitude so that the observed
differences are not likely to be chance variations.
Possible exceptions would be the differences
in profiles of X's pretest, posttest, and
follow-up ratings; and the profile changes for
A from posttest to follow-up. All other Ds
are at least twice the magnitude of the
"error" term. Certainly neither the successive
changes in B's profiles nor the change in A's
profile as a result of the experimental program
seems likely to be anything but real changes
in attitude.
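
The comparison against the "error" term can be sketched as follows, using the within-individual Ds as they appear to read in Table 1 and the .65 value for X (pretest vs. posttest) as the reference. The dictionary layout, the helper name `exceeds_error`, and the use of "twice the error term" as a cutoff are assumptions made for illustration; the article offers no formal test.

```python
ERROR_D = 0.65  # D for Individual X, pretest vs. posttest

# Same-individual comparisons across time periods, as read from Table 1
within_individual_ds = {
    ("X", "pre-post"): 0.65,
    ("X", "pre-followup"): 0.95,
    ("X", "post-followup"): 1.25,
    ("A", "pre-post"): 2.28,
    ("A", "pre-followup"): 3.24,
    ("A", "post-followup"): 1.05,
    ("B", "pre-post"): 1.66,
    ("B", "pre-followup"): 3.08,
    ("B", "post-followup"): 3.18,
}

def exceeds_error(d, error=ERROR_D, factor=2.0):
    """True when D is at least `factor` times the reference error value."""
    return d >= factor * error

likely_real = sorted(k for k, d in within_individual_ds.items() if exceeds_error(d))
possible_chance = sorted(k for k, d in within_individual_ds.items() if not exceeds_error(d))

print("at least twice the error term:", likely_real)
print("below twice the error term:", possible_chance)
```

Under this rule the comparisons falling below the cutoff are X's three profile changes and A's change from posttest to follow-up, which matches the exceptions named in the text.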


Validity of the Technique

This technique has certain face
validity. As noted before, Individual X is
a highly successful television newscaster,
enjoying the highest ratings
of any program of its kind in the area. His
initial ratings and successive semantic profiles
were extremely high and positive, thus
reflecting his esteem.

In addition to the face validity of the technique,
the postshow group interview of the
audience provided some evidence which corroborated
the profile changes. For example, in
reviewing the tape of the interview, it was
found that the ratio of negative comments to
positive comments was nearly 3 to 1 for Individual
B. This same ratio for A was 1 to 12.
Certain individuals in the group stated that 
they would not voluntarily watch B on any 
type of program. Derogatory remarks were 
made about his delivery, voice, and “personal- 
ity.” With the exception of one individual, no 
one said they enjoyed him on a news program. 

With respect to Person A, direct evidence 
for a positive change in attitude was obtained. 
One woman said that she had seen him on 
several morning programs and just did not 
care for him, but here she liked him. Much 
emphasis was placed on the maturity which 
he now assumed. One man said he remem- 
bered A blowing bubble gum and could not 
forget the incident, but on the news, he 
stated, A seemed mature and informed. An- 
other woman commented that he seemed to 
control his "little-boyish-eyebrows."


SUMMARY


A group of 20 adults, 9 men and 11 women, 
previewed a new television-news program on a 
closed channel hookup in a Y-TV studio. The 
semantic differential was employed to meas- 
ure the change of audience attitude toward 
certain individuals as a result of watching the 
program. Ratings were obtained at three dif- 
ferent times: prior to the program, imme- 
diately following the program, and 15 weeks 
later. The results indicated: 

1. For an individual who did not appear on
the program, X, but with whom the entire
group was familiar, the instrument showed
highly reliable and consistent results.

2. This technique proved to be extremely 
sensitive to the changes in verbal stereotypes 
which took place as a result of watching the 
preview. The postprogram ratings accurately 
predicted the follow-up ratings. 

3. There was an indication of both "face
validity" and corroborating evidence from the
interview attesting to the validity of the
technique.


REFERENCES 


Jenkins, J. J., Russell, W. A., & Suci, G. J. An
atlas of semantic profiles for 360 words. Amer. J.
Psychol., 1958, 71, 688-699.

Mindak, W. A. A new technique for measuring advertising
effectiveness. J. Market., 1956, 20, 367-379.

Osgood, C. E. The nature and measurement of
meaning. Psychol. Bull., 1952, 49, 197-237.

Osgood, C. E., & Luria, Zella. A blind analysis of a
case of multiple personality using the semantic
differential. J. abnorm. soc. Psychol., 1954, 49, 579-591.

Osgood, C. E., & Suci, G. J. A measure of relation
determined by both mean difference and profile
information. Psychol. Bull., 1952, 49, 251-262.

Osgood, C. E., Suci, G. J., & Tannenbaum, P. H.
The measurement of meaning. Urbana: Univer.
Illinois Press, 1957.

Stagner, R., & Osgood, C. E. Impact of war on a
nationalistic frame of reference: I. Changes in general
approval and qualitative patterning of certain
stereotypes. J. soc. Psychol., 1946, 24, 187-215.


(Received March 26, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 1, 41-44 


THE EFFECT OF EXPOSURE TO AN EXTREME 
STIMULUS ON JUDGMENTS OF SOME 
STIMULUS-RELATED WORDS 


BERNARD J. FINE 


Quartermaster Research and Engineering Center, Natick, Massachusetts 


Since the early scaling studies of Volkmann 
(1936) and Sherif (1936), among others, 
many experiments have demonstrated end- 
anchoring effects. The phenomenon generally 
occurs when an extreme stimulus or “anchor” 
is added to a scale as an end-point. Addition 
of the anchor tends to extend the scale so that 
subsequent judgments on the scale are ex- 
pressed in less extreme categories. This has 
been referred to as a “contrast” effect by 
Sherif, Taub, and Hovland (1958). 

The anchoring effect has been shown to
occur with judgments concerning such diverse
phenomena as inclinations of lines (Rogers,
1941), pleasantness of colors (Hunt & Volkmann,
1937), weights (Heintz, 1955), and
the prestige of various occupations (McGarvey,
1943). Johnson (1945) concluded
that since the phenomenon has been demonstrated
in judgments on such a diversity of
material, it may be taken as a general principle
of judgment.

If the anchoring effect is a phenomenon
that may be generalized, then the implications
for the experimenter using judgment scales to
estimate the impact of certain
stimuli on humans are serious but,
often, not readily apparent.

It is obvious to most investigators concerned
with psychophysical phenomena that anchoring
effects may confound their results. For
example, Campbell, Lewis, and Hunt (1958)
found that when subjects were asked to judge
the pitch of notes in terms of a piano keyboard
reference system, differences in judgment
of a common note's pitch occurred
depending upon whether the subjects had

1 Sherif et al. (1958) also demonstrated that the
interposition of an anchor may result in a shift in judgments
toward the anchor. They refer to this as an "assimilation"
effect. It occurs with the introduction of the anchoring
stimulus immediately adjacent to the stimuli being judged.




heard low or high comparison notes in a 
series. In such instances, however, the experi- 
menter is generally aware of this effect and 
through appropriate ordering of the stimuli 
can prevent its occurrence. However, there is
little evidence that the anchoring phenom- 
enon has been taken into account as a vari- 
able in judgment situations other than the 
psychophysical. This is particularly true in 
studies involving the use of ordinal scales 
before, during, and after exposure of subjects 
to a controlled stimulus in order to evaluate 
the effect of the stimulus on the subjects’ 
judgments. 

The present study was designed to illus- 
trate the necessity for taking anchoring phe- 
nomena into account in judgmental situations 
involving ordinal scales. 

An experiment, reported elsewhere (Fine & 
Gaydos, 1959), in which subjects were ex- 
posed to ambient temperature conditions, 
which, it could be assumed, made them feel 
colder than they had ever felt before, pro- 
vided the setting for this investigation. It was 
assumed that exposure of the subjects to this 
extreme cold stress would have the effect of 
interposing a new, more extreme anchoring 
stimulus at the “cold” end of an ordinal scale 
on which they were required to rate cold- and 
warm-toned words. It was hypothesized that 
as a consequence of their exposure to the cold, 
subjects’ ratings of “cold” words on the 
warm-cold scale would shift away from the 
newly imposed, extreme end-point (“con- 
trast" effect). 


METHOD 
Subjects 


Subjects were 140 enlisted men from Fort Devens,
Massachusetts. They were brought to the Quartermaster
Research and Engineering Center daily in
groups of 12 to participate in experiments conducted
during the period 3 to 20 March, 1959.




Procedure 


Each day, the 12 subjects were randomly divided
into three groups. Six subjects (Group 1) were assigned
to an experiment in which they were exposed
to an environment of 50°F., 50% relative humidity,
and 5-mph wind speed for 75 minutes, clad only in
shorts and lying virtually motionless. Groups 2 and
3, both three-man groups, were assigned to a second
experiment. Group 2 was exposed to an environment 
of 50°F., 50% relative humidity with no wind for 
1 hour. This group worked on a psychological task 
during the exposure period and was stripped to the 
waist. Group 3 served as the control group for 
Group 2. Subjects in Group 3 performed the same 
task as those in Group 2 but were fully clothed and 
in a warm, comfortable environment. By nature of
the combinations of ambient conditions and clothing 
worn, Group 1 could be described as being exposed 
to a rather extreme cold stress (for subjects not 
accustomed to such exposure), Group 2 could be 
characterized as being exposed to a moderate cold 
stress, and Group 3 to no cold stress.

Prior to taking part in these experiments, all sub- 
jects completed a scaling task (Session 1) under 
warm, comfortable conditions. The task can best be
described by quoting the instructions accompanying it.


Below you will notice something that looks like 
a thermometer. Let us suppose that this thermom- 
eter shows how hot or cold people feel instead of 
how hot or cold the temperature is. 

You will notice that there are numbers from 1
to 16 running up from the bottom to the top of 
the thermometer. One means that it is as cold as 
a person can possibly feel. Sixteen means that it is 
as hot as a person can Possibly feel. 

Now, we all use many different words to tell 
other people how hot or cold we feel. These words 
that we use sometimes mean different things to 
different people. We would like to know what they 
mean to you, 

We have listed 14 of these words alongside of 
the thermometer. We would like you to write next
to each word the number on the thermometer
which shows what the word means to you in terms
of feeling hot or cold. 

For example, take the first word, "frozen."
Suppose you tell your buddy that you feel "frozen."
You would probably mean that you feel as cold as
you could [...] would write in 1
beside the word "frozen" because the thermometer
shows that 1 is as cold as a person can possibly feel.


6. We would like to know 
uld fall. That is, if you were 
at you were “a li 


all of the words. You 
for more than one won 




The words and phrases rated by the subjects were
"a little warm," "slightly cold," "extremely [illegible],"
"terribly cold," "warm," "extremely cold," "[illegible],"
"a little cool," "very warm," "cool," "very cold,"
"extremely hot," and "cold" in that order. The
instructions were read aloud to the subjects and care
was taken to insure that all subjects understood the
task. After completing the scaling task, the subjects
participated in the experiments to which they had
been assigned.

Approximately 6 hours later, 2 hours after the
subjects had completed their tasks, had eaten a
meal, and relaxed for a short while, all subjects again
completed the identical scaling task (Session 2)
under the same comfortable environmental conditions
as initially. No explanation for the repetition
of the task was given other than that the experimenter
would like to have the subjects rate the
words once again.


RESULTS 2


Fifteen subjects turned in ratings that were
incomplete or which indicated that they had
not understood the instructions. The data
from these subjects were eliminated from the
analysis. After the elimination of these subjects,
the Ns for Groups 1, 2, and 3 were 58,
29, and 38, respectively.

Since the words scaled by the subjects were
not a random sample of words describing
warmth or coldness, it was not feasible to
determine the variance of the subjects' ratings
and, therefore, it was not possible to
compute the significance of the differences in
shifts in judgments between groups. However,
it was possible to create a total population
of changes in ratings of all words (from
Session 1 to Session 2) and to compare the
three groups individually with regard to these
changes. Accordingly, the before-after changes
in ratings for the 13 words were treated as a
given finite population over all subjects. A
subsample consisting of all of the "cold"
words was drawn from that population and
the z test was used to show whether or not
the sample was a random one. Thus, the mean
change for the subpopulation of "cold" words
was compared with the mean change for the
total population of words for each of the three
groups. The results of these comparisons are
shown in Table 1.
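
One way such a comparison could be set up is sketched below. The article does not state the formula, so this assumes that a subsample mean is compared with the mean of a fully known finite population, with the standard error adjusted by the finite population correction. The rating changes are invented and `subsample_statistic` is a hypothetical name.

```python
import math
from statistics import mean, pvariance

def subsample_statistic(population, subsample):
    """Standardized difference between a subsample mean and a finite population mean."""
    big_n, n = len(population), len(subsample)
    mu = mean(population)
    sigma2 = pvariance(population, mu)
    # Standard error of a mean drawn without replacement from a finite population
    se = math.sqrt(sigma2 / n * (big_n - n) / (big_n - 1))
    return (mean(subsample) - mu) / se

# Hypothetical rating changes (Session 2 minus Session 1), not the study's data
all_changes = [0, 0, 0, 0, 1, 1, 1, 1]
cold_changes = [1, 1, 1, 1]

statistic = subsample_statistic(all_changes, cold_changes)
print(round(statistic, 3))
```

A large value suggests that the "cold" word changes do not behave like a random draw from the changes for all words. The sketch is undefined when the subsample is the whole population, since the standard error is then zero.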


2 The author wishes to acknowledge the assistance
given him in various phases of the data analysis by
Agnes M. Galligan, G. P. Wadsworth, and [illegible]
Monroe.




TABLE 1
COMPARISON OF CHANGE FROM SESSION 1 TO SESSION 2 OF RATINGS OF "COLD" WORDS
AND ALL WORDS FOR THREE GROUPS

Mean Changes

                    Group 1           Group 2           Group 3
                   N    Change       N    Change       N    Change
"Cold" words     406    +.160      203    -.010      266    -.064
All words        754    +.038      377    -.109      494    -.143
z               3.12              1.73              1.68
p               <.01

Note. (+) indicates a shift away from the cold end of the scale;
(-) indicates a shift toward the cold end of the scale.
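
As an arithmetic check on Table 1, the N entries appear to count individual ratings (subjects times words), so dividing them by the number of words should recover the group sizes. The sketch assumes 13 words in all and 7 "cold" words, as stated elsewhere in the article; the variable names are illustrative.

```python
# Entries in the N columns of Table 1
n_cold = {"Group 1": 406, "Group 2": 203, "Group 3": 266}
n_all = {"Group 1": 754, "Group 2": 377, "Group 3": 494}

COLD_WORDS = 7   # "cold" words among those rated
ALL_WORDS = 13   # total words rated

subjects = {}
for group, total in n_all.items():
    count, remainder = divmod(total, ALL_WORDS)
    if remainder != 0 or n_cold[group] != count * COLD_WORDS:
        raise ValueError(f"N entries for {group} are not consistent")
    subjects[group] = count

print(subjects)
print(sum(subjects.values()))
```

The implied group sizes sum to 125, which agrees with 140 subjects less the 15 whose data were eliminated.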


If there were no systematic effects of exposure
to cold on subjects' ratings, it would
be expected that the subpopulation of "cold"
word changes in ratings would not differ significantly
from the total population of changes
in ratings. It is evident from Table 1 that this
is the case for Group 2 and Group 3. However,
for Group 1, the subpopulation of "cold"
word changes in ratings does differ significantly
from the total population of changes.
It is further evident that this shift is away
from the cold end of the scale.

To confirm that this statistically significant
shift in ratings of "cold" words by Group 1
was not due to extreme shifts of only two or
three "cold" words, the mean shifts for each
word by groups were computed. This analysis
revealed that six out of seven of the "cold"
words shifted away from the cold end of the
scale for Group 1, whereas only one "cold"
word out of seven shifted away from the cold
end of the scale for either Group 2 or Group
3. The difference between Group 1 and either
Group 2 or 3 in the number of words
shifting away from the cold end of the scale
is statistically significant (p < .05 using
Fisher's exact probability test).
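
The exact probability for this comparison can be worked out directly from the hypergeometric distribution. The sketch below uses the counts given in the text (6 of 7 "cold" words shifting away for Group 1, 1 of 7 for a comparison group); the function name and the choice to show a one-tailed value are assumptions, since the article does not say which form of the test was used.

```python
from math import comb

def fisher_exact_one_tailed(a, b, c, d):
    """P of a 2 x 2 table with cell `a` at least as large as observed, margins fixed."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(row1, k) * comb(row2, col1 - k) / total_tables
    return p

# Rows: Group 1, comparison group; columns: shifted away, did not shift away
p_one_tailed = fisher_exact_one_tailed(6, 1, 1, 6)
p_two_tailed = 2 * p_one_tailed  # valid here because the margins are symmetric

print(round(p_one_tailed, 4), round(p_two_tailed, 4))
```

Both values fall below .05, in line with the significance level reported in the text.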


DISCUSSION 

The data substantiate the hypothesis that
exposure to extreme cold would shift subjects'
ratings of "cold" words away from the cold
end of a warm-cold ordinal scale. This is consistent
with the findings in the literature on
anchoring phenomena. Exposure to moderate
cold had no significant effect on subjects'
ratings.


While the magnitude of the changes in rat- 
ings is small and probably of little practical 
consequence, it should be noted that this is 
more likely than not a function of the struc- 
ture of the scale and the number of words 
rated. Certainly, rating 13 words on a 16- 
point scale minimized the chance for change 
to occur. Had more scale points or fewer 
words been used, shifts might have been con- 
siderably larger. 

The implications of this study for investi- 
gators in similar situations may be illustrated 
by the following example. Subjects in Group 
1 were required, as part of their performance 
in the main study in which they were partic- 
ipating, to periodically indicate on a five-point 
ordinal scale their feelings of warmth or cold- 
ness during the cold exposure. Since the words 
on the five-point scale also appeared on the 
16-point scale of the present study, it is cer- 
tain that at some time, during the exposure, 
the meaning, to the subjects, of the words on 
the five-point scale, changed as the subjects 
became colder. A subjective cooling curve, 
plotted from this data, would have little 
meaning for it would not reflect a change in 
subjects’ feelings alone, but also a change in 
the meaning, to the subjects, of the words 
used in the rating task. 

One way of preventing such occurrences is 
to provide the subject with a physical con- 
comitant to the verbal anchor points before 
the actual experimentation starts. As in psy- 
chophysical studies where the subject judges 
certain stimuli against a standard stimulus, so 
in situations such as the present one a stand- 
ard should be provided. Thus, in the present 
situation, had the subjects all been exposed 
for 75 minutes to the same cold conditions at 
some time prior to the actual experimental 
session, they would have had a referent for 
the anchor point “as cold as you can possibly 
feel” and their subsequent judgments would 
probably not have been confounded by 
changes in the meanings of words. 


SUMMARY 


The study was designed to test the hy- 
pothesis that subjects’ ratings of cold-toned 
words would shift away from the cold end of 
a warm-cold ordinal scale as a consequence 




of exposing the subjects to extremely cold 
temperatures. 

Three groups of subjects were exposed re- 
spectively to extremely cold, moderately cold, 
and warm ambient conditions. Before and 
after exposure, all subjects rated 13 words 
describing various degrees of warmth or cold- 
ness on a 16-point warm-cold ordinal scale. 

The results verified the hypothesis; ratings 
of the “cold” words for the group exposed to 
the extreme cold shifted away from the cold 
end of the scale. The results are consistent 
with previous research on end-anchoring 
effects. 

Implications for the researcher using or- 
dinal scales in situations where subjects are 
exposed to extreme stimuli with which they 
have had little experience are discussed. 


REFERENCES 


Campbell, D. T., Lewis, N. A., & Hunt, W. A.
Context effects with judgmental language that is
absolute, extensive, and extra-experimentally anchored.
J. exp. Psychol., 1958, 55, 220-228.

Fine, B. J., & Gaydos, H. F. Relationship between
individual personality variables and body temperature
response patterns in the cold. Psychol. Rep.,
1959, 5, 71-78.

Heintz, R. K. An inquiry concerning the effects of
anchoring points upon judgment: I. The effect of
remote anchoring points upon the judgment of
lifted weights. II. Immediate vs. pre-established
frames of reference in the judgment of attitudinal
stimuli. Dissertation Abstr., 1955, 15, 1110-1111.

Hunt, W. A., & Volkmann, J. The anchoring of an
affective scale. Amer. J. Psychol., 1937, 49, 88-92.

Johnson, D. M. A systematic treatment of judgment.
Psychol. Bull., 1945, 42, 193-224.

McGarvey, H. R. Anchoring effects in the absolute
judgment of verbal materials. Arch. Psychol., NY,
1943, No. 281.

Rogers, S. The anchoring of absolute judgments.
Arch. Psychol., NY, 1941, No. 261.

Sherif, M. The psychology of social norms. New
York: Harper, 1936.

Sherif, M., Taub, D., & Hovland, C. I. Assimilation
and contrast effects of anchoring stimuli on judgments.
J. exp. Psychol., 1958, 55, 150-155.

Volkmann, J. The anchoring of absolute scales.
Psychol. Bull., 1936, 33, 742-743.


(Received March 28, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 1, 45-49 


AN EXPERIMENTAL STUDY OF THE EFFECTIVENESS 
OF BRAINSTORMING* 


EDITH WEISSKOPF-JOELSON AND THOMAS STEPHAN ELISEO 2


Purdue University 


The problem of creativity has become the 
focus of much attention during recent years. 
“Creativity” as used in this paper can be 
briefly defined as “the production of new 
ideas.” Especially the study of the conditions 
under which people are most likely to pro- 
duce creative ideas has aroused considerable 
interest. The policies of educators, of adminis- 
trators in business or industry, and of others 
may be strongly affected by better knowledge 
of such conditions. This may apply to rela- 
tively permanent conditions such as “the emo- 
tional atmosphere” among co-workers, or to 
temporary conditions such as attitudes pro- 
duced by the type of instructions with which 
a problem is presented to the potential creator
of ideas.

The present research is concerned with such
temporary increases in productivity produced
by the widely-used technique of “brainstorming”
(Osborn, 1957). When engaging in brainstorming
a group of people is presented with
a problem, and with instructions such as the
following regarding the way in which its solution
should be sought:


Group work is good only if everyone lets himself
go, and thinks and says whatever idea [...]
no matter how silly it may seem. Sometimes the
most ridiculous idea may be better than [...]
Therefore, try to avoid criticizing your own or
anyone else's ideas. If anyone is criticizing I will [...]
him to stop. Laughing and [...] are [...]
criticism. We want a lot of [...]
the better chance there is to find the [...]
Don't worry about how good the ideas are [...]
be evaluated later. When you try to [...]
at the same time, nothing much [...]
get too bogged down in judging, [...]
creating. Also, it has been found that good ideas are
sometimes created by improving on other ideas and
by combining two or more ideas into new ones. Are
there any questions? Let me summarize the way we
will operate: 1. Criticism is ruled out. 2. The wilder
the idea the better: let yourself go and say whatever
comes into your head. 3. Quantity is wanted.
4. Combination and improvement of ideas is desired.


1 The paper is based on a thesis presented to the
graduate school of Purdue University, Lafayette,
Indiana, by the junior author in partial fulfillment of
the requirements for the MS degree in psychology.
The study was directed by the senior author.

2 Now at Veterans Administration Hospital,
Lebanon, Pennsylvania.


These instructions were used in the present 
experiment. In other cases of brainstorming 
the instructions were paraphrased somewhat 
differently but in each case they were elabora- 
tions of the four italicized points which are 
summarized at the end of the previous para- 
graph. 

It is the purpose of this study to compare, 
as to quality as well as quantity, the ideas 
produced by instructions suggesting the sus- 
pension of criticism and the ideas produced 
by instructions suggesting critical censorship.


METHOD 


Subjects. The subjects were 42 undergraduate college
students at Purdue University. They were randomly
divided into six groups, each group consisting
of seven students, four of whom were males and
three of whom were females. The groups were not
matched for intellectual level because it was assumed 
that random assignment would cancel out intellectual 
differences to a large extent, and that the remaining 
uncanceled differences would be of negligible mag- 
nitude because of the narrow margin in intellectual 
ability among college students. 

Task. The task with which the subjects were pre- 
sented was the invention of brand names for three 
kinds of articles designed especially for Purdue stu- 
dents, namely, a cigar, a deodorant, and an auto- 
mobile. There are several reasons why these tasks 
seemed especially adequate for the purpose of the 
present study: (a) The creating of brand names and 
similar activities pertaining to the fields of advertis- 
ing, publicity, or promotion constitutes the problem 
area for which brainstorming was originally devel- 
oped, notwithstanding the fact that its use was later 
extended to many other problem areas. (b) The eval- 
uation of the effectiveness of brand names constitutes 
somewhat less of a problem than the evaluation of 
suggested solutions to other problems. Thus, quality 
ratings by judges who are Purdue students seem 
relatively adequate measures of the effectiveness of 
brand names created to impress Purdue students. It
would be more difficult to find equally adequate 




measures of the effectiveness of ideas regarding, for 
example, the solution of technical, social, or personal 
problems. (c) Creating brand names is a relatively
“simple and specific” problem that is “not too broad”
and can be “stated clearly” (Osborn, 1957). Such
problems have been said to lend themselves especially
well to brainstorming procedures.

Procedure. The junior author met with the six 
groups of subjects for one session each. With three of 
the six groups (to be called “the noncritical groups” 
in the following discussion) he started the session 
with the brainstorming instructions quoted above. 
With the other three groups (to be called “the criti- 
cal groups”), he started the session with the follow- 
ing comments: 


No idea is ever worth anything unless it has 
been well thought out; in doing this you have to 
think about the problem and see it in all its as- 
pects. It is only when you do this that good ideas 
are produced. We want to have a number of ideas; 
but we want good, practical ideas. Let's try to 
avoid stupid or silly ones. Also, it has been found
that good ideas are sometimes created by improving
on other ideas and by combining two or more
ideas into new ones. Are there any questions? Let
me summarize the way we will operate: 1. Think
clearly and logically. Try to see all the aspects of
a problem. 2. We want good, practical ideas. 3. We
want a number of ideas from which to later judge
the better ones; however, the emphasis is on quality
not on quantity. 4. Combination and improvement
of ideas is desired.


Thus it was attempted [...] the instructions of the
critical [...] the ones of the [...] the instructions
was identical [...] instructed to exert it.

Following the two kinds of introductions all
groups were given the following explanation of their
task:


[...] men. Let's create some [...] Purdue male, [...]


[...] suggested that any additional time allotment
would result in a negligible number of additional ideas.

Evaluation. Next, the brand names were evaluated 
by 150 Purdue male undergraduate students none of 
whom had participated in the original experiment. 
It would have required too much of each judge's 
time to rate the total 902 different brand names. 
Therefore, each judge was asked to rate only about 
one third of the total number of responses: this 
allotment of brand names to the judges was done in 
such a way that each name was rated by 50 judges. 
The names were presented in a different random 
order for each judge. 

The rating was performed on a five-point scale ac- 
cording to the following written instructions to the 
judges: 


Brand names are an important consideration in 
selling a product. Some names are appealing to
most of the potential customers, others are not. 
This afternoon, we would like you to rate some 
names for a product. Assume that you wished to 
buy the product; then rate the names according to 
how much it attracts you. As you can see on the 
first page of your sheets, A means the name has a
VERY GREAT ATTRACTION, B—a GREAT
ATTRACTION, C—SOME ATTRACTION, D—
LITTLE ATTRACTION, and E—VERY LITTLE
ATTRACTION. Please place a circle around either
the A, B, C, D, or E according to how much the 
name attracts you. Please work quickly and rate 
according to your first impression of the name. 


FIG. 1. Cumulative frequencies of the critical and
noncritical responses according to quality of the
product: Cigar. [Ordinate: frequency; abscissa: quality
score; o = noncritical, x = critical.]

FIG. 2. Cumulative frequencies of the critical and
noncritical responses according to quality of the
product: Deodorant. [Ordinate: frequency; abscissa:
quality score; o = noncritical, x = critical.]

On the basis of this evaluation a quality score was
computed for each brand name, defined as the total
weight of the A and B ratings, whereby a weight of
five was given to the A ratings and a weight of four
to the B ratings. The reliability of the scores was
determined by dividing the 50 raters for each group
of responses into two randomly selected groups of 25
raters each and by correlating the ratings given
by these two subgroups of raters. The overall Pearson
product-moment correlation for all responses was
.72, significant at the 1% level.
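The scoring and reliability procedure described above can be illustrated with a minimal sketch. The weights (five for A, four for B, nothing for C to E) are taken from the text; the function names, the toy ratings, and the use of a single pool of raters are assumptions made only for illustration, since the study used three pools of 50 judges and reported one overall correlation.

```python
import random
from math import sqrt

# Weights stated in the text: A = 5, B = 4; C, D, and E add nothing.
WEIGHTS = {"A": 5, "B": 4}


def quality_score(ratings):
    """Total weight of the A and B ratings given to one brand name."""
    return sum(WEIGHTS.get(r, 0) for r in ratings)


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(ratings_by_name, seed=0):
    """Split the raters at random into two halves, score every name
    within each half, and correlate the two sets of scores.

    ratings_by_name maps a brand name to its sequence of letter ratings,
    with rater i in position i for every name (an assumption of this sketch).
    """
    names = list(ratings_by_name)
    n_raters = len(ratings_by_name[names[0]])
    order = list(range(n_raters))
    random.Random(seed).shuffle(order)
    half = n_raters // 2
    first, second = order[:half], order[half:]
    scores_1 = [quality_score(ratings_by_name[n][i] for i in first) for n in names]
    scores_2 = [quality_score(ratings_by_name[n][i] for i in second) for n in names]
    return pearson_r(scores_1, scores_2)


# Hypothetical ratings from four raters for three made-up names.
toy_ratings = {"Name 1": "AAAA", "Name 2": "CCCC", "Name 3": "BBBB"}
toy_r = split_half_reliability(toy_ratings)
```

With 50 raters per name the largest possible score under this weighting is 250; the toy data are unanimous within each name, so the two halves agree perfectly.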
ANALYSIS, RESULTS, AND DISCUSSION 


The responses obtained under the critical
and noncritical conditions were compared as to
quantity as well as quality. The noncritical condition
produced a greater number of responses than
the critical condition. The probabilities that
differences, equaling or exceeding the observed
ones, between the number of noncritical and
critical responses, would occur by chance, were
less than .003%, for each of the three products.

Figures 1, 2, and 3 show the cumulative
frequency distributions of the quality scores.
As can be seen, the relationships between the
distributions of the noncritical and the critical
names are very similar for the three products.

The figures suggest that there is no appreciable
difference between the number of
responses given by the two groups at the left
end of the distribution where the quality of 
the responses is extremely high. With decreas- 
ing quality the difference increases. Thus, the 
preponderance of responses from the non- 
critical group over the ones from the critical 
group owes its origin to the large number of 
responses of relatively low quality in the non- 
critical group. 

As can be seen from Table 1, the mean 
quality score of the critical responses is 
greater than the mean quality score of the 
noncritical responses for all three products, 
and this difference is statistically significant 
beyond the 5, 1, and 0.5% level, respec- 
tively, for “Automobile,” “Deodorant,” and 
the total of all responses. 
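The significance statements above can be checked against the figures printed in Table 1. The following is a rough sketch, not the authors' computation: it assumes the test statistic is the difference between the two means divided by the tabled standard error, and it uses a normal approximation for the one-tailed probability, which seems reasonable for samples of this size but is not stated in the paper.

```python
from math import erfc, sqrt

# (product, noncritical mean, critical mean, standard error) as printed in Table 1.
TABLE_1 = [
    ("Cigar", 41.49, 45.26, 4.07),
    ("Deodorant", 38.44, 48.14, 3.95),
    ("Automobile", 45.40, 51.83, 3.83),
    ("Total", 42.41, 48.78, 2.29),
]


def critical_ratio(mean_nc, mean_c, se_diff):
    """Difference between the means in units of its standard error."""
    return (mean_c - mean_nc) / se_diff


def one_tailed_p(z):
    """Upper-tail area of the standard normal curve (assumed approximation)."""
    return 0.5 * erfc(z / sqrt(2))


results = {}
for product, mean_nc, mean_c, se in TABLE_1:
    z = critical_ratio(mean_nc, mean_c, se)
    results[product] = (round(z, 2), one_tailed_p(z))
    print(f"{product:<11} ratio = {z:4.2f}   one-tailed p = {results[product][1]:.4f}")
```

Under these assumptions the ratios for the three products reproduce the legible entries in the last column of Table 1, and the probabilities fall on the same side of the 5, 1, and 0.5% levels as the text reports.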

Table 2 gives further indications of the 
qualitative superiority of the critical re- 
sponses. It indicates that the critical responses 
received relatively more quality scores above 
the overall mean than the noncritical re- 
sponses. 
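One way such a comparison could be set up is sketched below. The observed frequencies (258 and 120) are those of Table 2 and the numbers of responses (665 and 237) are those of Table 1; the expected frequencies are an assumption of this sketch, namely that responses above the overall mean would be divided between the conditions in proportion to the number of responses each produced. The paper's own expected values are not legible in this copy, so the result should not be read as the authors' figure.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


n_responses = {"NC": 665, "C": 237}   # totals from Table 1
observed = {"NC": 258, "C": 120}      # responses above the overall mean, Table 2

total_above = sum(observed.values())
total_n = sum(n_responses.values())

# Assumed expectation: proportional to each condition's share of all responses.
expected = {k: total_above * n_responses[k] / total_n for k in observed}

chi2 = chi_square([observed[k] for k in ("NC", "C")],
                  [expected[k] for k in ("NC", "C")])

CRITICAL_5_PERCENT_DF1 = 3.841  # tabled chi-square value for 1 df at the 5% level
print(round(chi2, 2), chi2 > CRITICAL_5_PERCENT_DF1)
```

Under this assumption the statistic exceeds the tabled value for one degree of freedom at the 5% level, which is in keeping with the footnote to Table 2.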

Figures 4, 5, and 6 show the cumulative
relative frequency distributions of the two
groups of responses. As can be seen, the
relationships shown in Figures 1, 2, and 3 are
drastically changed when the total number of
responses for both response groups is controlled.
There is a tendency for the critical
curve to exceed the noncritical curve except
for responses of extremely low quality.


FIG. 3. Cumulative frequencies of the critical and
noncritical responses according to quality of the
product: Automobile. [Ordinate: frequency; abscissa:
quality score; o = noncritical, x = critical.]


TABLE 1

COMPARISON OF THE MEAN QUALITY SCORES OF THE NONCRITICAL
AND CRITICAL RESPONSES

                                                  Standard
Product      Condition    N     Mean      SD       Error
Cigar           NC       121    41.49    23.78
                C         65    45.26    27.83      4.07       .93
Deodorant       NC       218    38.44    25.94
                C         80    48.14    31.66      3.95      2.46**
Automobile      NC       326    45.40    26.59
                C         92    51.83    33.91      3.83      1.68*
Total           NC       665    42.41    26.07
                C        237    48.78    31.70      2.29      [illegible]***

* Significant at 5% level (one-tailed test).
** Significant at 1% level (one-tailed test).
*** Significant at 0.5% level (one-tailed test).

What practical conclusions can be drawn 
from these comparisons, assuming that their 
validity may extend beyond the scope of this 
study? 

The fact that the noncritical curve tends to
rise above the critical curve except for a small
number of responses of highest excellence
(Figures 1, 2, 3) has important practical
implications. If it is the purpose of problem
solving to produce a specific number of ideas
of highest possible quality, these best ideas
will tend to be of higher or equal quality
when the noncritical method is used; this is
the case because a horizontal line will tend to
cut the noncritical curve at a point closer, or
equally close, to the ordinate than the point
at which it cuts the critical curve. A possible
exception to this rule occurs when a very
small number of responses is required. There
the two methods tend to be equally promising,
with a slight advantage of the critical
method. The cumulative number of noncritical
responses begins to be significantly higher
than the cumulative number of critical
responses within and below the quality intervals
with the midpoint 48 for “Cigar,” 58 for
“Deodorant,” and 91 for “Automobile”
(beyond the 1% level). However, we must take
into consideration that it is more time consuming
to judge a large number of noncritical
responses than a smaller number of critical
responses as to quality; it takes longer to
identify the responses of high quality among
the noncritical responses than among the critical
responses. To what extent the disadvantage
of this added time investment would
detract from the advantages of the noncritical
method would depend on the specific situation.


TABLE 2

COMPARISON OF RELATIVE FREQUENCY OF RESPONSES
GREATER THAN THE OVERALL MEAN

Condition    Observed    Expected    df    χ²
NC              258
C               120
Total           378

* Significant at 5% level.


FIG. 4. Relative cumulative frequencies of the noncritical
and critical responses according to quality
of the product: Cigar. [Ordinate: frequency; abscissa:
quality score; o = noncritical, x = critical.]

FIG. 5. Relative cumulative frequencies of the noncritical
and critical responses according to quality
of the product: Deodorant. [Ordinate: frequency;
abscissa: quality score; o = noncritical, x = critical.]

On the other hand, according to the results
of this study, the advantages of the noncritical
method appear to be based mainly on the
larger number of responses produced. Tables
1 and 2, and Figures 4, 5, and 6 show that
the data support a decision in favor of the
critical method if the number of critical
responses can be increased to equal the number
of noncritical responses. However, it must
again be taken into consideration that it
would require more time, or more subjects, to
produce a given number of responses by the
critical method than to produce the same
number by the noncritical method.


FIG. 6. Relative cumulative frequencies of the noncritical
and critical responses according to quality
of the product: Automobile. [Ordinate: frequency;
abscissa: quality score; o = noncritical, x = critical.]

Finally, it should be kept in mind that the 
results of this study are a function of the 
criteria by which the responses were evaluated.
If the evaluation is based on ratings,
the relative merit of brainstorming might de- 
pend on a variety of attitudinal character- 
istics of the raters. The fact that such con- 
ventional names as “Sportsman,” “Esquire,” 
or “Century” were among the three most 
highly-rated brand names suggests that the 
imaginative ideas of the brainstormers were 
wasted on the conservative taste of the judges. 
Perhaps judges and consumers with a greater 
preference for unusual brand names could 
throw a more favorable light on brain- 
storming. 


REFERENCE 


Osborn, A. F. Applied imagination. (Rev. ed.) New
York: Scribner, 1957.


(Received March 29, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 1, 50-54


OUTPUT RATES AMONG MACHINE OPERATORS:
III. A NONINCENTIVE SITUATION IN TWO LEVELS
OF BUSINESS ACTIVITY


HAROLD F. ROTHE AND CHARLES T. NYE


Fairbanks, Morse and Company, Beloit, Wisconsin 


Previous studies (Rothe, 1946, 1947, 1951; 
Rothe & Nye, 1958, 1959) have led to some 
hypotheses relating the consistency of output 
of industrial operators to the adequacy of the 
financial incentives in their work situations. 
These hypotheses are: 


[1] ... the incentives to work may be considered 
ineffective when the ratio of the range of intra- 
individual differences is greater than the ratio of the 
range of interindividual differences (Rothe, 1946, p. 
326). 

[2] . . . if the intercorrelation of output rates for 
two periods closely related in time is less than .70, 
the incentivation is not highly effective, while inter- 
correlation higher than .80 indicates effective incenti- 
vation (Rothe & Nye, 1958, p. 185). 


This paper presents output data on other 
groups of industrial operators and relates 
these data to those hypotheses. These data 
lend some support to one of the hypotheses, 
and they tend to weaken the other one. These 
data are interesting in their own right particularly
because there are two sets of data obtained
from one plant at two different times
and in two different levels of business activity.


BACKGROUND OF THE STUDY


The employees were machine operators in Plant C
(to distinguish it from Plants A and B, previously
[...]) [...] small plant in a city in [...]
[...]-week period in 1960. Plant C was a fairly new
[...] just as Plants A and B [...] economic
conditions at the [...] Plant C was also [...]
conditions in 1958. In [...] enjoying favorable
business conditions.

Plant C was of particular interest because, although
it had no financial incentive system, it had
had a practice of insisting that all employees reach a
certain amount of standard production or be
disciplined—perhaps discharged. Language in the collective
bargaining agreement permitted this [...] As
a matter of fact, this practice was said to be in effect
at the time of the study; but as the data reveal, this
was not true. The standard production of each employee
was posted daily, but there was no [...]
reward for high production, nor discipline for
substandard production.


Data 


The weekly average output (in which 100
equals standard) and the number of employees
for whom data were available each
week of the 1958 study are shown in Table 1.
The weekly efficiencies varied from 90.13%
to 94.29% and showed no trend with time.

The weekly average output and the number
of employees for whom data were available
each week in 1960 are shown in Table 2. In
Table 2 the weekly efficiencies varied from
92.91% to 96.01% and again showed no
trend. The average output was higher in 1960
than in 1958 (95.2 versus 92.5), and although
the number of employees varied somewhat
from week to week during each period, there


TABLE 1
WEEKLY AVERAGE OUTPUT (PERCENTAGE
PERFORMANCE OF STANDARD) FOR GROUP
OF MACHINE OPERATORS, 1958

                 Percentage     Number of
Week Ending      Performance    Employees

April 27            93.24          55
May    4            91.37          43
      11            91.06          38
      18            91.83          38
      25            93.47          34
June   1            94.29          36
       8            93.41          33
      15            93.79          33
      22            91.87          33
      29            93.45          26
July   6            90.13          37




were many more employees in 1960 than 
there were in 1958. This latter occurrence 
indicates that business was much better in 
1960 than it was in 1958. 

In all studies of output rates in this series 
the distributions of the rates of the employees 
for any week were approximately normal, al- 
though the distributions for the machine op- 
erators in Plant B were slightly skewed 
(Rothe & Nye, 1959). In Plant C most of 
the distributions for the 1958 period were 
distinctly not normal; for the 1960 period the 
distributions were more nearly approaching 
normality. In fact, the shapes of the distribu- 
tions are probably the most interesting parts 
of this study. For that reason the complete 
distributions of all the obtained data are pre- 
sented here in Table 3 for 1958 and Table 4 
for 1960. 

The correlation of the employees’ perform- 
ances for 1 week with their performances for 
the following week was calculated by the
Pearsonian r method. The median r thus ob- 




TABLE 2
WEEKLY AVERAGE OUTPUT (PERCENTAGE
PERFORMANCE OF STANDARD) FOR GROUP
OF MACHINE OPERATORS, 1960

                 Percentage     Number of
Week Ending      Performance    Employees

Jan.  10            96.01          58
      17            94.46          61
      24            96.54          59
      31            95.69          58
Feb.   7            95.01          66
      14            95.74          54
      21            94.62          59
      28            92.99          68
Mar.   6            95.04          64
      13            95.77          65
      20            95.49          66
      27            94.86          50


tained was .48 with a range of .29 to .80 for 
1958. The median r for 1960 was .53 with a 
range of .17 to .72. 
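The week-to-week consistency measure just described can be sketched as follows. The procedure (a Pearsonian r between each week and the following week, then the median of those coefficients) is from the text; the function names, the handling of operators with no data in a given week, and the toy figures are assumptions introduced for illustration only.

```python
from math import sqrt
from statistics import median


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def week_to_week_rs(output):
    """Correlate each week with the next, using only operators who have
    data in both weeks (assumed treatment of missing weeks).

    output maps an operator to a list of weekly percentages of standard,
    with None where no figure is available.
    """
    n_weeks = len(next(iter(output.values())))
    rs = []
    for w in range(n_weeks - 1):
        pairs = [(v[w], v[w + 1]) for v in output.values()
                 if v[w] is not None and v[w + 1] is not None]
        xs, ys = zip(*pairs)
        rs.append(pearson_r(xs, ys))
    return rs


# Hypothetical output figures for four operators over three weeks.
toy_output = {
    "op1": [95, 97, 96],
    "op2": [80, 85, None],
    "op3": [102, 99, 101],
    "op4": [90, 92, 88],
}
toy_rs = week_to_week_rs(toy_output)
toy_median_r = median(toy_rs)
```

The median, rather than the mean, keeps a single atypical pair of weeks from dominating the summary, which matters when the coefficients range as widely as those reported here.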


TABLE 3
FREQUENCY DISTRIBUTIONS OF WEEKLY AVERAGE OUTPUT OF ALL
OPERATORS FOR ALL ELEVEN WEEKS, 1958

[Table body not legible in this copy. Rows: percentage
performance to standard, in intervals from 110 down to 25.
Columns: weeks ending April 27 through July 6. Number of
operators per week, as far as legible: 55, 43, 38, 38, 34,
36, 33, 33, 33.]




TABLE 4

FREQUENCY DISTRIBUTIONS OF WEEKLY AVERAGE OUTPUT OF ALL OPERATORS
FOR ALL TWELVE WEEKS, 1960

                              Week Ending
Percentage
Performance       January          February           March
to Standard     10  17  24  31    7  14  21  28    6  13  20  27
115 a 1 
10 1 1 1 j 
10s 3 2 2 1 1 2 1 by 
100 10 9 13 T 14 10 8 9 9 8 ba 26 
95 27 27 23 28 27 23 26 2 30 37 is is 
90 13 15 12 18 14 13 12 15 13 H i 5 
85 5 6 3 2 2 6 10 6 3 A 
1 : H 
80 1 1 4 1 3 4 1 4 2 z ; " 
75 1 1 1 1 2 : i 
70 3 i 2 2 1 
65 1 1 1 2 2 1 
60 1 
55 
50 1 
45 
40 
No. of
Operators 58 61 59 58 66 54 59 68 64 65 66 59
The ratio of the range of intra-individual
differences was obtained here, as in the earlier
studies, by comparing the output for the most
productive week with the output for the least
productive week for each employee. The median
of these ranges was 1.18 in 1958 and
1.14 in 1960. The ratio of the range of
interindividual differences was also obtained in the
usual manner by comparing the performances
of the most productive employee and least
productive employee for each week. This ratio
was 1.96 in 1958 and 1.57 in 1960.


DISCUSSION


The ratios [...] individual differences are
opposite to what [...] the other [...] of
incentives. That is, the interindividual ratios
exceeded the intra-individual ratios in each
year, and this relationship has been hypothesized
as indicating effective incentive [...].
This hypothesis clearly needs more evidence
or more refining since it is strongly suspected
that the incentivation in this situation was
not very effective.
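The two hypotheses quoted at the beginning of this paper can be applied mechanically to the figures reported here. The sketch below is only an illustration: the function names and toy output figures are hypothetical, the use of a median for the weekly interindividual ratios is an assumption (the text says only that the ratio was obtained "in the usual manner"), and the label for correlations between .70 and .80 is supplied here because the hypothesis as quoted does not cover that interval.

```python
from statistics import median


def intra_individual_ratios(output):
    """Best week divided by poorest week, one ratio per operator."""
    return [max(weeks) / min(weeks) for weeks in output.values()]


def inter_individual_ratios(output):
    """Best operator divided by poorest operator, one ratio per week."""
    n_weeks = len(next(iter(output.values())))
    ratios = []
    for w in range(n_weeks):
        week = [weeks[w] for weeks in output.values()]
        ratios.append(max(week) / min(week))
    return ratios


def indications(intra_ratio, inter_ratio, median_r):
    """Apply hypotheses [1] and [2] as quoted from Rothe (1946) and
    Rothe & Nye (1958)."""
    h1 = "ineffective" if intra_ratio > inter_ratio else "not ineffective"
    if median_r < 0.70:
        h2 = "not highly effective"
    elif median_r > 0.80:
        h2 = "effective"
    else:
        h2 = "indeterminate"  # assumed label; the quoted hypothesis is silent here
    return h1, h2


# Hypothetical complete records for two operators over three weeks.
toy_output = {"a": [90, 100, 95], "b": [60, 66, 63]}
toy_intra = median(intra_individual_ratios(toy_output))
toy_inter = median(inter_individual_ratios(toy_output))

# Figures reported in the text for Plant C.
plant_c_1958 = indications(1.18, 1.96, 0.48)
plant_c_1960 = indications(1.14, 1.57, 0.53)
```

Applied to the reported values, the first hypothesis does not mark the incentives as ineffective in either year while the second does, which is the divergence between the two hypotheses taken up in the discussion.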
The most striking feature of the data of this
study was the skewness of the distributions of
output, particularly in 1958. This is shown in
Table 3. This phenomenon occurred during a
period of poor business when employees were
being laid off and when the average production
was only 92.5% performance to standard.
In 1960, when business was better and
employees had been recalled to work,
average production rose to 95.2 and the variability
and skewness of the weekly distributions
were reduced as the instances of
low productivity practically disappeared.
Skewness of output has long been hypothesized
as indicating “restriction of output.”
Ford (1931) and Yoder (1942) suggested
that negatively skewed distributions indicated
restriction, but they had no data to support 
their hypotheses. Bliss (1931) using data 
from some “bench workers" found output to 
be skewed positively, and he attributed this 
to “lack of motivation.” More recently Gaudet 
and Livingston (1959) wrote that a nega- 
tively skewed distribution is often found for 
incentive piece workers, and this indicates 
“restriction of output.” They also present no 
data to support this view. 

In the light of all this theoretical discussion 
of skewness of output distributions, it is per- 
haps all the more noteworthy that the present 
study is the only one in the authors’ series of 
studies of output rates in which a clear and 
important skewness has been found. In fact, 
only in one previous study was there any 
indication of skewness, and in that one there 
was an incentive system in effect, and it 
seemed to be effective (Rothe & Nye, 1959). 

It seems desirable to illustrate the skewness 
found here more effectively. For that purpose, 
data for the week of the largest number of 
employees of Table 3 have been plotted on a 
histogram which appears as Figure 1. 

It cannot be concluded definitely that the
incentives to produce were ineffective here,
but that is strongly suspected. One thing is
certain: there was no financial incentive in
effect. On the negative side, it is also certain
that substandard producers were not disciplined
as they might have been (i.e., “punished”).
It is quite likely, therefore, that the
incentives to work were not very great, and
this was accompanied by a negatively skewed
distribution of output rates, especially in
1958.

It was hoped to test Bedford’s hypothesis
that both fast and slow producers have a
common mode, that fast producers show less
variation than slow producers, that the distribution
of output for fast producers is positively
skewed and the distribution for slow
producers is negatively skewed, and that these
phenomena may serve as an objective measure
of “restriction of output” (Bedford,
1922). The distributions of the “best” five
and the “poorest” five employees were examined
for each year, but there were insufficient data
to permit the drawing of any conclusions.

FIG. 1. Output rates for 55 machine operators for 1
week in a nonfinancially-incentivated shop, 1958.
[Histogram. Ordinate: number of employees; abscissa:
percentage performance to standard.]


This study of machine operators was char- 
acterized by strongly skewed distributions of 
output rates in 1958, and less skewed distri- 
butions in 1960. It tends to support one hy- 
pothesis previously presented, and to weaken 
another hypothesis. This paper presents, for 
the first time, significantly negatively skewed 
distributions of actual output rates. These occurred
in a situation apparently accompanied
by a lack of incentivation (either positive, as
money, or negative, as disciplinary action).
They occurred in a period of poor business 
conditions when employees were being laid 
off. Two years later, when business had im- 
proved, and laid-off employees were recalled 
to work, the productivity was higher, had less 
variation, and was less skewed in distribution. 
This paper also shows, as do all papers in this 
series of industrial operators, the need for 
further studies of output, either as a phenom- 
enon that is inherently valuable, or because 
output is used as a criterion against which the 
effectiveness of other variables is sometimes 
measured. This paper further shows the need 
for describing the economic conditions or 
level of business existing at the time that in- 
dustrial studies are made. It supports the 
theory presented by other writers that a nega- 
tively skewed distribution indicates “restric- 
tion of output.” Unlike other papers which 
proposed this theory but had no data, this 
paper presents data showing negatively 
skewed actual output data in a situation 
where the incentives to work were probably 
not very effective. 




REFERENCES 


Bedford, T. The ideal work curve. J. industr. Hyg.,
1922, 4, 235-245.

Bliss, E. F., Jr. Earnings of machine tenders and of
bench workers. Personnel J., 1931, 10, 102-107.

Ford, A. A scientific approach to labor problems.
New York: McGraw-Hill, 1931.

Gaudet, F. J., & Livingston, D. G. How to make
sure you’re understood. Superv. Mgmt., 1959, 28-35.

Rothe, H. F. Output rates among butter wrappers:
II. Frequency distributions and an hypothesis
regarding the “restriction of output.” J. appl.
Psychol., 1946, 30, 320-327.

Rothe, H. F. Output rates among machine operators:
I. Distributions and their reliability. J. appl.
Psychol., 1947, 31, 484-489.

Rothe, H. F. Output rates among chocolate dippers.
J. appl. Psychol., 1951, 35, 94-97.

Rothe, H. F., & Nye, C. T. Output rates among coil
winders. J. appl. Psychol., 1958, 42, 182-186.

Rothe, H. F., & Nye, C. T. Output rates among
machine operators: II. Consistency related to
methods of pay. J. appl. Psychol., 1959, 43, 417-420.

Yoder, D. Personnel management and industrial
relations. New York: Prentice-Hall, 1942.


(Received April 9, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 1, 55-58 


THE NEWCOMER IN OPEN AND CLOSED GROUPS 1


ROBERT C. ZILLER, RICHARD D. BEHRINGER, AND MATHILDA J. JANSEN


Fels Group Dynamics Center, University of Delaware 


While group stability has long been recog- 
nized as a major dimension in organization 
theory (Hemphill, 1949), relatively few lab- 
oratory experiments have been reported con- 
cerning group membership changes. The lab- 
oratory study reported here was designed to 
investigate the reactions to a newcomer by 
groups which had been apprized of various 
imminent membership changes (open groups) 
and groups which did not anticipate member- 
ship changes (closed groups). The study was 
also designed to explore the relative power of 
the newcomer and the regular group member. 

Simmel (Wolff, 1950, p. 402-408) and von 
Wiese and Becker (1932) have speculated 
extensively concerning the consequences of 
the addition of a new member to the com- 
munity or society. In general their proposi- 
tions describe the hostility which attends the 
advent of a newcomer to rather stable groups. 


Becker (1950, p. 50) also suggests that the
newcomer is deeply disliked by the regular
members since he represents the forces of
change (also see Heider, 1958, p. 193-194).
However, the afore-mentioned observations 
tend to concern implicitly only what is here 
referred to as closed groups. The open-closed 
group concept imposes a new dimension on 
the newcomer phenomenon and renders the 


earlier analyses incomplete. 


METHOD

Subjects

Sixty-four male social science students participated
in the experiment as a requirement of their courses.
Each of the 32 three-person experimental groups was
composed of two naive subjects and a confederate.


Procedure


Two naive subjects and a confederate, ostensibly
another naive subject, were seated around a
note-passing device in which they were hidden
from each other's view but could communicate
by exchanging notes.

1 This paper is an [...] report presented
at the Eastern Psychological [...] meetings in
Atla[...]


In the first phase of the experimental procedure, 
the groups were involved in two tasks, a Meier Art 
Judgment item and a numerosity task. This phase of 
the experiment requiring approximately 25 minutes 
was ostensibly part of a group decision-making ex- 
periment but in reality was designed to facilitate the 
development of the group qua group. 

Before the next and critical problem was intro- 
duced, the group was told that since the experiment 
also concerned the relationship between group size 
and the quality of group decisions, another member 
would be added to the group. The new member (an- 
other confederate) was introduced to the regular or 
core members and “instructed” as to the operation of 
the note-passing apparatus. 

The third problem was again a numerosity task.
One thousand and fifty black dots were displayed on 
a white form similar to a ping-pong paddle. The 
form was displayed for only 2 seconds. When the 
members had submitted their individual estimates as 
to the correct number of dots, the group was per- 
mitted “ten minutes in which to reach a group deci- 
sion as to the correct number of dots.” Finally, when 
the group estimate had been submitted, the individ- 
ual members submitted private estimates of the 
“number of dots they personally really thought there 
were on the card.” 


Experimental Conditions 


The 2 X 2 factorial design allowed the comparison 
of the relative influence (power) of the newcomer (a 
confederate) and a regular member (actually another 
confederate) under open- and closed-group condi- 
tions. The open condition was created by stating at 
the outset of the experiment and again after the first 
task had been completed that “during the course of 
the experiment either a member will be asked to 
leave the group or a new member will join the 
group.” Under the closed conditions this statement 
was eliminated and it was presumed that the groups 
expected to remain intact during the entire session. 

The experimental conditions designed to test the 
relative influence of a regular member (a confed- 
erate) and the newcomer (another confederate in the 
same group) were established by providing either the 
newcomer or a regular member with the correct 
answer and procedure for arriving at that answer to 
the critical task. When the newcomer proposed the 
correct method and answer, the regular member pro- 
posed essentially the same method but a different 
answer. The conditions were reversed when testing 
the relative influence of the regular member. All
confederates were enjoined to avoid the “leadership”
role, that is, the function of submitting the final 
group estimate. It should be noted that the task was 
of such a degree of complexity that the naive mem- 


56 R. C. Ziller, R. D. Behringer, and M. J. Jansen 


bers could neither estimate the correct number of 
dots with confidence nor determine or happen upon 
the correct approach. 


Measures 


The dependent variables included measures of 
communication direction (number of messages sent 
by the naive subjects to the newcomer and the core 
confederate), group influence, group satisfaction, and 
the relative ratings of the confederates by the naive 
group members. It was presumed that each of these 
measures reflected to some degree the extent to which 
the newcomer was assimilated by the original group. 

Two measures of influence were involved.2 The
first was derived from the group mean (excluding 
the confederates) of the weighted responses to the 
following item: “I changed my individual estimate 
of the number of dots a great deal as a result of the 
group discussion.” The alternatives were arranged on 
a six-point scale varying from “agree very much” to 
“disagree very much.” This index represents the
individual member's perceived change or influence.

The second measure of influence was simply the 
difference between the group decision and the cor- 
rect answer. Because of the inherent positive skew- 
ness of the distribution of these scores, it was neces- 
sary to apply the cube root transformation. 
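
The effect of the cube root transformation mentioned above can be illustrated with a minimal sketch. The error scores below are invented for illustration and are not the study's data; the helper names `cube_root` and `skewness` are likewise hypothetical. The sketch only shows that a cube root pulls in a long right tail and so reduces positive skewness.

```python
import math


def cube_root(x: float) -> float:
    """Real cube root that keeps the sign of the input."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def skewness(values):
    """Moment coefficient of skewness (third moment over sd cubed)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical absolute differences between group decision and correct answer.
raw_errors = [10, 25, 40, 60, 90, 150, 400, 900]
transformed = [cube_root(e) for e in raw_errors]

print("skewness before:", round(skewness(raw_errors), 2))
print("skewness after: ", round(skewness(transformed), 2))
```

The same helper would apply to the satisfaction measure described next, since the text reports the identical transformation there.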

An objective measure of the group’s satisfaction 
with the group estimate was also derived from the 
dot data. The mean difference of the members’ post-group
decision estimates from the group's decision was
calculated and again the cube root transformation
was applied.


Finally, the naive 


3. This person 
throughout the me 


RESULTS 

The data were sub, 
variance described b 
published report, 19 
for combined neste 


Examination of the results in Table 1 indi- 
cates that there is a significant difference be- 
tween open and closed groups (p < .01) in


2 These are the sa[...] methodological stud[...]
(Ziller & Behringer, 1959).

3 H. Simon, personal communication, 1957.


TABLE 1
ANALYSIS OF VARIANCE OF RATINGS OF PLEASANTNESS

Source     df    MS        F      Error Term
A           1    25.080    5.66   D, AB
B           1     4.6800          D, AB
C           1     9.6800   4.19   CD, AB
AB          1      .0521          D, AB
AC          1     4.7663          CD, AB
BC          1      .2664          CD, AB
ABC         1     1.0372          CD, AB
D, AB      24     4.4273          E, DAB
CD, AB     24     2.3125          CE, DAB
E, DAB     28     2.9732
CE, DAB    28     1.3661

Note. "A" refers to the open-closed group dimension;
[...] to the condition [...] (regular confederate) or
newcomer (another confederate) possesses the [...];
[...] rated member (regular member or newcomer);
[...] ratings by the naive members of the two
confederates [...]; [...] ratings of the confederates
by a single naive [...].

ratings of pleasantness with reference to both
confederates. The higher ratings appear in the
open groups. In addition, when a less conservative
error term is used (CE, DAB) there
is a significant difference between the ratings
of regular confederates and newcomer confederates
(p < .05), the newcomer ratings
being lower.

In order to conserve space, the tables of the other
analyses are omitted. However, with reference
to "quality of ideas" the ratings of both confederates
were significantly higher again in
open as compared with closed groups (p
< .05), and higher for the regular confederate
than the newcomer confederate (p < .05; CE,
DAB, error term). With regard to “influence,”
again the regular confederate was rated
significantly higher than the newcomer confederate
(p < .01; CE, DAB, error term).

With regard to the results of the analyses
involving communication direction, the interaction
effect of open-closed conditions and
rated confederates was found to be significant
(p < .05). More notes were sent by the naive
members to the newcomer in the open condition;
more notes were sent to the regular
member in the closed situation.

The analyses of the objective measure of
influence involving the accuracy of group


Newcomer in Open and Closed Groups 57 


solutions did not yield significant results. 
However, the results involving the subjective 
measure of influence or estimated change by 
the naive group members were statistically 
significant (p < .01). Those groups in which 
the regular member possessed the correct 
solution reported less change than those where 
the newcomer was correct (p < .001). 

With regard to group satisfaction, greater
private agreement with the group answer was 
revealed under the open conditions as opposed 
to the closed conditions (p < .01; CE, DAB, 
error term). 


DISCUSSION 


At a very general level of abstraction, the 
results indicate that members of open groups 
react in a very different way than members 
of closed groups to the other members of the 
group and particularly to the newcomer. This 
rather general conclusion assumes greater 
trenchancy when it is recalled that the open 
groups differed from the closed groups only 
in the fact that the open groups had been 
informed at the time of their inception of 
some impending membership changes. 

Thus, in the open groups which were pre- 
sumed to be more task oriented and less 
interpersonally oriented, the members with 
the most superior methods of approaching the 
task and the correct answer (the newcomer as 
well as the regular confederate) were more 
accepted by the members of open groups than 
by closed groups. 


However, a further differentiation between
the newcomer and the regular members comes
to light with regard to the number of notes
received by these respective [...]. In
terms of assimilation, the results indicate that
the newcomer is admitted and [...] with
greater facility by the open groups than by
the closed groups.

[...] that under open [...]
or conditions of social metabolism where the
elements of a complex are temporarily interrelated
(due to the anticipated [...],
removal, or addition of members), constant
adjustment to the specific [...] of
the changing membership as well as to the
changing roles and status hierarchies is perceived
by the members as dysfunctional. It is
recognized that energy is only dissipated in
an endless effort to establish transitory
interpersonal relationships.

In open groups in contrast to closed groups, 
measured performance of the group as a 
whole offers a more durable basis for group 
structure. Accordingly, open groups are less 
concerned with who actually performs the 
task. Task accomplishment is a sufficient con- 
dition for member satisfaction (Cartwright & 
Zander, 1953, p. 310). 

In the event of a newcomer, a redistribu- 
tion of prestige symbols and social cathexis is 
less necessary in open groups. In addition, the 
newcomer may be perceived by the members 
of open groups as a resource for increasing 
the probability of group success rather than 
as a restrictive and/or disruptive supernumer- 
ary threatening to attenuate the shares of the 
limited interpersonal or prestige rewards 
within the group. Or perhaps very simply, the 
newcomer enters a far more complex, inter- 
related system of elements in closed groups 
than in open groups, and assimilation is ac- 
cordingly accomplished with diminished facil- 
ity and increased hostility—at least during 


the initial phases of the assimilation process.


It is compelling to speculate briefly con- 
cerning the implications for small group re- 
search and social science in general of these 
results involving the open-closed group con- 
cepts. It is all too clear that small group 
experiments have been concerned almost ex- 
clusively with processes and products of 
closed-reaction systems, systems in which the 
forces tend to move toward a state of equilib- 
rium (see Heinicke & Bales, 1953). However, 
this traditional closed model is scarcely iso- 
morphic with regard to social systems which 
also concern the minatory aspects of illness, 
death, and change of group composition in 
general. 

A basic characteristic of our social systems 
is that they maintain themselves in a state of 
perpetual change of components. It is true 
that within a given time span the composition 
of the social system appears static. However, 
the perception is nonveridical. We are deceived
by our own parallel movement along
the flow of time. Moreover, perpetual changes 
involving the components of the system are 
obscured by the relatively “steady state” of 
the system at the molar level.




The relatively slow rate of change in com- 
position of most social groups may also tend 
to obscure the phenomenon. If the rate of 
change of elements in a social group were to 
be markedly increased however, the conse- 
quences would not only become manifest but 
obtrusive if not chaotic. Perhaps, in fact, 
there exists an optimum rate of change for 
various objectives of the group. 

A second aspect of the experiment concerns 
the naive members' reactions to the newcomer
as compared with the core confederate. In 
this respect, the “oldtimer” was rated higher 
under all conditions with regard to “quality 
of ideas," "pleasantness," and “influence” 
(see Table 1). Thus, even for a relatively 
short group “history,” the regular members 
appear to react more positively to the longer 


term member. Tenure and status appear to be 
positively related.


SUMMARY 


The dimension of group stability was investigated
in a laboratory setting. The assimilation
process of a new member was studied
in open groups (groups which had been apprized
of various [...] changes) as compared [...]
(groups which did not anticipate membership
[...] designed to ex[...] from the outset,
[...] group after the [...] decision-making
[...] group, either the [...] confederate held the
[...] method for arriv[...]. The group interacted
through the medium of a standard note-[...]

Under the open conditions in comparison
with the closed conditions the naive members
were more positively disposed towards the
confederates, rated the ideas of both confederates
higher, and were more satisfied (by
an objective criterion) privately with the
group solution. Moreover, under the open
conditions, more notes were remitted to the
newcomer confederate than to the regular
confederate; while under the closed conditions,
more notes were submitted to the
regular confederate than to the newcomer
confederate. These results were interpreted as
indicating that open groups tend to be less
interpersonally oriented but more task oriented
than closed groups.

Finally, the regular members tended to
react more positively to the longer term
member; that is, tenure and status were
found to be positively associated.

REFERENCES

BECKER, H. Through values to social interpretation.
Durham, N. C.: Duke Univer. Press, 1950.
CARTWRIGHT, D., & ZANDER, A. (Eds.) Group dynamics
research and theory. Evanston, Ill.: Row,
Peterson, 1953.
FEDERER, W. T. Experimental design. New York:
Macmillan, 1955.
HEIDER, F. The psychology of interpersonal relations.
New York: Wiley, 1958.
HEINICKE, C., & BALES, R. F. Developmental trends
in the structure of small groups. Sociometry, 1953,
16, 7-38.
HEMPHILL, J. K. Situational factors in leadership.
Ohio State U. Stud., Bur. Educ. Res. Monogr.,
1949, No. 32.
VON WIESE, L., & BECKER, H. Systematic sociology:
On the basis of the Beziehungslehre and Gebildelehre
of Leopold von Wiese. New York: Wiley,
1932.
WOLFF, K. H. (Ed.) The sociology of Georg Simmel.
Glencoe, Ill.: Free Press, 1950.
ZILLER, R. C., & BEHRINGER, R. D. Group persuasion
by the most knowledgeable member under conditions
of incubation and varying group size. J. appl.
Psychol., 1959, 43, 402-406.


(Received April 9, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 1, 59-62 


THE PREDICTION OF RESEARCH COMPETENCE 
AND CREATIVITY FROM PERSONAL HISTORY 


WALLACE J. SMITH, LEWIS E. ALBRIGHT, J. R. GLENNON 
Standard Oil Company (Indiana) 


AND 
WILLIAM A. OWENS 


Purdue University 


Traditionally, two different research designs
have been employed by investigators in
their use of biographical data. These are 
directly analogous to the present-employee 
and follow-up methods of test validation. 
Exemplifying the latter are “weighted appli- 
cation blank” studies such as those of Kirch- 
ner and Dunnette (1957), Minor (1958), 
and Naylor and Vincent (1959). In this 
approach, the pool of items for validation 
consists of those that appear on the em- 
ployer's application form, filled out at the 
time of hiring by the criterion groups. Typ- 
ically, the criterion used is job tenure, so that 
items are retained which discriminate the long 


and short tenure individuals. A validity coefficient
based on the composite of surviving
items would be a predictive validity estimate.
The nature of the items retained, using this
design, is almost always strictly demographic
(age, marital status, number of dependents,
etc.).

In the other design, which we will term the
“personal history” approach, an instrument
covering various background topics is administered
to presently employed individuals who
have been segregated into criterion groups. In
contrast to the weighted application blank
design, a diversity of criterion measures has
been used here; for example, the “human
relations” effectiveness of industrial supervisors
(Edgerton, Feinberg, & Thomson, 1957),
gross earnings of automobile salesmen (Kennedy,
1958), and creativity of machine designers
(Owens, Schumacher, & Clark, 1957).
Validity coefficients in these studies are concurrent
validity estimates. Whereas the
weighted application blank is generally limited
to factual items, the personal history
questionnaire can include additionally items
covering preferences, attitudes, and interpretation
of experience (Rundquist, 1950).

The present study involves the attempt to 
validate an extensive personal history form 
with petroleum research scientists, using three 
different criteria of success in research work. 


METHOD 


Subjects. The Ss for this study were volunteer 
male employees at the Standard Oil Company’s 
largest research laboratory. All of them were college 
graduates, the majority possessing advanced degrees, 
The most common areas of educational specialization 
were chemistry and chemical engineering with a few 
individuals having degrees in other technical areas 


such as mathematics, physics, etc. The length of service
of the group ranged from less than 1 year to over
20 years, the heaviest single concentration (30%) 
occurring in the 1- to 5-year interval. Seventy-six 
percent were less than 40 years of age. The number 
of Ss varied from 331 to 198 because of missing 
predictor or criterion data. 

Predictor. The personal history questionnaire used 
was tailor-made for this study and consisted of 484 
multiple-choice items. The items were assembled 
from various sources. The content of the question- 
naire included home and family background, various 
aspects of present and previous jobs, athletic inter- 
ests, school and college activities, etc. 

Criteria. The criterion which was used initially in 
the study was a rating of the researchers on over-all 
job performance made by their supervisors. These 
ratings are made periodically for administrative pur- 
poses; they were not collected specifically for this 
investigation. The rating scheme used is a seven-step 
forced-distribution system, such that ratings of 1 and 
7 represent the high and low ends of the distribution, 
respectively. The percentages of Ss at each point on 
the distribution are as follows: 1, 10%; 2, 15%; 3,
15%; 4, 20%; 5, 15%; 6, 15%; 7, 10%. For pur- 
poses of item analysis, the 20% with ratings of 4 
were eliminated, leaving the high and low 40% as 
criterion groups. Some evidence for the reliability of 
this criterion is the fact that any man’s rating is the 
product of the judgment of several different super- 
visors. Also, it is reported that relatively little shift 
in rating occurs from year to year for most people. 
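
The way the forced distribution yields the two criterion groups can be sketched as follows. The percentages are those given in the text; the function name `criterion_group` and the dictionary layout are illustrative assumptions, not part of the original procedure.

```python
from typing import Optional

# Percentage of Ss at each step of the seven-step forced distribution
# (1 = high end, 7 = low end), as reported in the text.
FORCED_DISTRIBUTION = {1: 10, 2: 15, 3: 15, 4: 20, 5: 15, 6: 15, 7: 10}


def criterion_group(rating: int) -> Optional[str]:
    """Assign a rating to a criterion group; ratings of 4 are dropped."""
    if rating in (1, 2, 3):
        return "high"
    if rating in (5, 6, 7):
        return "low"
    return None  # middle 20%, excluded from the item analysis


high_pct = sum(p for r, p in FORCED_DISTRIBUTION.items()
               if criterion_group(r) == "high")
low_pct = sum(p for r, p in FORCED_DISTRIBUTION.items()
              if criterion_group(r) == "low")

print("high group:", high_pct, "%  low group:", low_pct, "%")
```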


60 Smith, Albright, Glennon, and Owens 


Some time after the project was under way, it became
possible to collect ratings regarding the creativ-
ity of the Ss. The rating instrument used was the 
Check-List Rating Scale for Creativity: Form C-1, as 
developed by Taylor (1958). This is a Thurstone- 
type scale of 24 descriptive items. The rater checks 
only those items that apply to or describe the ratee. 
The statements range in favorableness from “He 
shows signs of being one of the most creative men 
in this work that I have known" to "He never has
an idea of his own to suggest." The reliability of the 
scale is suggested by Taylor's correlations, which 
averaged .83, between the creativity checklist ratings 
and descriptive ratings of the same individuals made 
4 to 5 months later on "originality." In accordance 
with Jurgensen's rationale (1949), the form was 
scored by computing the algebraic sum of the scale 
values instead of using the median scale value, as 
Taylor did. On this basis, individuals with checklist 
scores of -160 to -10 made up the low criterion
group and those whose scores ranged from +50 to
+180, the high group.
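
The difference between the two scoring rules can be illustrated with a small sketch. It assumes that each checklist item carries a signed scale value (the text reports only the resulting score ranges, so the values below are invented), and the function names are hypothetical. The cutoffs are the ones stated above.

```python
from statistics import median
from typing import List, Optional


def algebraic_sum_score(checked_values: List[float]) -> float:
    """Score used in this study: sum of the scale values of checked items."""
    return sum(checked_values)


def median_score(checked_values: List[float]) -> float:
    """Alternative rule attributed to Taylor: median checked scale value."""
    return median(checked_values)


def creativity_group(score: float) -> Optional[str]:
    """Criterion group from the algebraic-sum score, using the stated cutoffs."""
    if 50 <= score <= 180:
        return "high"
    if -160 <= score <= -10:
        return "low"
    return None


# Two hypothetical ratees with similar typical item values; the second
# had more favorable items checked.
ratee_a = [40, 35, 30]
ratee_b = [40, 35, 30, 25, 20]

for name, items in (("A", ratee_a), ("B", ratee_b)):
    total = algebraic_sum_score(items)
    print(name, "sum:", total, "median:", median_score(items),
          "group:", creativity_group(total))
```

Under these assumed values the sum separates the two ratees more sharply than the median does, because it also reflects how many items were checked.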

The third criterion consisted of the number of 
patent disclosures filed by each man during the 5- 
year period 1954 to 1958 inclusive. (A patent dis- 
closure is a document written by the technical man 
describing an idea which he regards as patentable. 


Before credit is received for the disclosure, it must
be screened by an in-[...] committee; after further
[...] become a formal pat[...] span was chosen
[...] bution of disclosu[...] included sufficien[...]
eight or more as [...]

Procedure. Bec[...] history question[...]
five “units” of a[...] [...]oluntary nature of the project,
normal turnover, travel schedules, etc., some attrition
in numbers of Ss took place during the study.

[...]al grounds than on the [...]ty.

2 The writers are grateful to the Administrative
Division at the Research Laboratory, especially H. R.
Ahlberg, for his assistance in the data collection, as
well as for other help rendered during the study.

For the item analysis, all criterion groups were
subdivided randomly into validation and cross-
validation samples. The numbers of Ss in each are
given in Table A. Ns ranged from 25 to 68, but
were approximately equal for high and low samples
for a given criterion and content unit. Using a program
developed for the IBM 705, the significance of
the difference in response percentages of the high and
low criterion groups was computed for every response,
validation and cross-validation samples separately.
A scoring key was developed for each criterion
from those items which discriminated in both
the validation and cross-validation samples at or
beyond the .10 level, making a compound probability
requirement of .05 or less. The keyed items were
given weights of +1 or -1 in accordance with the
direction of discrimination.
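
The keying rule described above can be sketched as follows. The text does not say which significance test the IBM 705 program used, so the pooled two-proportion z test here is an assumption, as is the requirement that the difference run in the same direction in both samples. All counts and function names are hypothetical.

```python
import math
from typing import Tuple

Counts = Tuple[int, int, int, int]  # (hits_high, n_high, hits_low, n_low)


def two_proportion_test(hits_high: int, n_high: int,
                        hits_low: int, n_low: int) -> Tuple[float, float]:
    """Return (z, two-sided p) for the difference in response percentages."""
    p_high = hits_high / n_high
    p_low = hits_low / n_low
    pooled = (hits_high + hits_low) / (n_high + n_low)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_high + 1 / n_low))
    if se == 0:
        return 0.0, 1.0
    z = (p_high - p_low) / se
    return z, math.erfc(abs(z) / math.sqrt(2))


def item_weight(validation: Counts, cross_validation: Counts,
                alpha: float = 0.10) -> int:
    """+1 or -1 if the response discriminates in both samples, else 0."""
    z_v, p_v = two_proportion_test(*validation)
    z_c, p_c = two_proportion_test(*cross_validation)
    if p_v <= alpha and p_c <= alpha and z_v * z_c > 0:
        return 1 if z_v > 0 else -1
    return 0


# Hypothetical response: chosen by 30 of 50 highs and 18 of 50 lows in the
# validation sample, and by 28 of 50 highs and 19 of 50 lows on cross-validation.
print(item_weight((30, 50, 18, 50), (28, 50, 19, 50)))
```

A response that fails the .10 level in either sample receives a weight of 0 in this sketch, that is, it is left out of the key.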

The concurrent validity of these keys was estimated
by scoring the papers of 100 Ss selected
randomly from the pool of Ss having complete personal
history and criterion data. Some of the Ss in
this "scoring sample" were middle criterion individuals
never used in the item analyses, but others had
been in the keying samples. An empirical study
(Albright, Smith, Glennon, & Owens, in press) compared
the validity estimate obtained by this apparently
contaminated method with those obtained
using the conventional double cross-validation procedure
(Katzell, 1951). Contrary to expectation, the
“contaminated” r was very close to the larger of the
two sample r's, which Katzell proposed as the best
estimate of the true validity of the composite.


RESULTS 


Representativeness of subjects. The fact
that the Ss were volunteers raised a question
as to their representativeness of the entire
professional staff in the laboratory. A test
of representativeness was made by testing the
distribution of obtained over-all performance
ratings against that expected under the
forced-distribution system. This was done for
all five questionnaire units. None of the
values of χ² was significant at the 5% level,
indicating that the sample was not disproportionately
weighted with Ss from one or another
rating category. This test was also made
for the 100 randomly selected Ss in the scoring
sample, with nonsignificant results.
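
A goodness-of-fit test of this kind can be sketched as below. The expected percentages are those of the forced-distribution system described under Criteria; the observed counts are invented, and the critical value 12.592 is the standard tabled chi-square value for 6 degrees of freedom at the 5% level (seven categories, no estimated parameters).

```python
from typing import Dict

EXPECTED_PCT = {1: 10, 2: 15, 3: 15, 4: 20, 5: 15, 6: 15, 7: 10}
CRITICAL_05_DF6 = 12.592  # tabled chi-square, df = 6, alpha = .05


def chi_square(observed: Dict[int, int]) -> float:
    """Chi-square statistic of observed rating counts against the forced distribution."""
    n = sum(observed.values())
    statistic = 0.0
    for rating, pct in EXPECTED_PCT.items():
        expected = n * pct / 100
        statistic += (observed[rating] - expected) ** 2 / expected
    return statistic


# Hypothetical counts of volunteers at each rating step (n = 200).
observed_counts = {1: 22, 2: 28, 3: 31, 4: 42, 5: 29, 6: 30, 7: 18}

statistic = chi_square(observed_counts)
print("chi-square:", round(statistic, 3),
      "significant at .05:", statistic > CRITICAL_05_DF6)
```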


3 A 1-page table showing the number of subjects
for each of three criteria by personal history unit, by
validation and cross-validation samples, and by high
and low ratings has been deposited with the American
Documentation Institute. Order Document No.
6548 from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress; Washington
25, D. C., remitting in advance $1.25 for microfilm
or $1.25 for photocopies. Make checks payable
to: Chief, Photoduplication Service, Library of Congress.



Prediction from Personal History 61 


Intercorrelations of criteria. Table 1 presents
the intercorrelations of the three criteria.
These values are based on the total num-
bers of Ss for whom the criterion data were 
available, regardless of whether they filled out 
the personal history questionnaire. As might 
be expected, the two sets of ratings correlate 
more highly with each other than either does 
with the patent disclosure criterion. However, 
all three of the coefficients shown are signifi- 
cant at the .01 level. 

Validity and reliability of scoring keys. 
When the three sets of item weights were 
applied to the papers of the scoring sample, 
the results shown in Table 2 were obtained. 
As can be seen, the concurrent validity of the 
37-item over-all performance key was .613. 
Correspondingly, for the creativity and pat- 
ent disclosure criteria the validities are .521 
and .517, respectively, both based on 22 items. 
All of these coefficients far exceed the value 
required for significance at the .01 level. 

The number of items shown for each key 
may be somewhat misleading because the 
same item may discriminate on more than one 
criterion (although the same response is not 
necessarily scored each time). In fact, three 
items are included in all three keys. A total 
of 59 different items are scored. 

An estimate of the reliability of some of 
these scored items was obtained when 25 re- 
searchers consented to retake two of the per- 
sonal history units about 2 months after their 
original responses had been submitted. These
two units contained 29 of the 59 discriminating
items. The correlation of the 29-item
scores obtained on the two administrations
was .683. This value is similar in magnitude
to those reported by other investigators 


TABLE 1
INTERCORRELATIONS OF CRITERIA

                               2             3
1. Over-all Performance      .534          .487
   Rating                  (N = 362)     (N = 285)
2. Creativity Rating                       .200
                                         (N = 23[...])
3. Patent Disclosures


TABLE 2
RESULTS OBTAINED BY APPLYING KEY FOR EACH
CRITERION TO SCORING SAMPLE
(N = 100)

              Over-all
              Performance   Creativity   Patent Disclosure
              Key           Key          Key
N of Items    37            22           22
Mean          -.30          -.55         3.20
SD            5.92          4.37         3.75
r             .613          .521         .517


(Fiske, 1947; Guilford, 1950) with very large 
numbers of Ss. 
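
The retest reliability reported above is a product-moment correlation between the keyed scores from the two administrations. A minimal sketch of that computation follows; the score lists are invented and much shorter than the 25 researchers in the study, and the function name `pearson_r` is illustrative.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical keyed scores (sums of +1 and -1 weights) for eight
# respondents on the first and second administrations.
first_administration = [3, -1, 5, 0, 2, -4, 7, 1]
second_administration = [2, 0, 4, 1, 1, -3, 5, 2]

print(round(pearson_r(first_administration, second_administration), 3))
```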


DISCUSSION


The relatively high validities of the three 
scoring keys demonstrate the utility which the 
personal history technique can have with 
highly skilled individuals. (It is acknowledged 
that these coefficients are doubtless somewhat 
higher than those which would be obtained 
with an entirely new sample of Ss; but, as 
stated previously, the writers’ earlier study, 
in press, suggested that the amount of inflation
was insignificant.) As is often noted,
aptitude and other types of tests which work 
well at lower ability levels may fail to dis- 
criminate in groups of this kind. Also, it 
would be difficult to find another instrument 
which would yield a better “rate of return" 
in predictive effectiveness per unit of testing 
time. For example, most applicants can easily 
complete the 59 keyed items in 30 minutes.

The only other studies in this general area 
of which the writers are aware are those by 
Mandell (1950), Stein (1957), and Taylor 
(1958). Mandell, using a 37-item inventory, 
was unable to discriminate chemists rated 
high and low on job performance by their 
superiors and colleagues. Taylor found that 
20 out of 47 personal history items were significantly
related to either or both of two criteria
for 94 electronics engineers and scientists.
The criteria were ratings of productivity
and creativity. However, when he extended 
the scoring keys to a group of physicists, only
chance relationships obtained. Stein reports
a number of biographical variables which dif- 
ferentiate significantly between his groups of 




“more” and “less” creative industrial research 
chemists. There were 23 individuals in each 
group, the groups being determined by rank- 
ings and ratings from superiors, colleagues, 
and subordinates. No cross-validation data are 
reported. 

Although some of our “personal history” 
items are similar in nature to those found in 
tests of personality, interests, and values, it 
is felt that they engender less hostility on the 
part of the applicant and are less subject to 
faking than is characteristic of many such
tests. For one thing, they are presented in 
context with other items which do require 
responses of a strictly factual nature. Also, 
the applicant may do himself as much harm 
as good by trying to appear the “Organization 
Man,” due to the empirical weighting pro- 
cedure used. 

A review of the discriminating items sug- 
gests that a “self-confidence” factor may be 
what these items assess. This interpretation is 
reinforced by the frequency with which the 
high criterion groups say that they (a) have 
more readily taken advantage of opportunities 
presented them, (b) consider their achieve- 
ments thus far to be greater than those of 
others with the same education, (c) work 
more quickly than others, and (d) prefer to 
have many things "on the fire" simultaneously.
Also the high groups tend to have more
education than their colleagues, have obtained 
it on scholarships or fellowships, have worked 
as teachers or instructors, have published at 
least one technical paper and devote much 
time to reading, suggesting the presence additionally
of an "academic orientation" factor.


SUMMARY 


A personal history questionnaire was ad- 
ministered to a group of petroleum research 
scientists employed in a research laboratory 
of the Standard Oil Company. The question- 
naire items were analyzed and cross-validated 
against three criteria of research effectiveness:
ratings of over-all performance, ratings
specifically on creativity (both made by su- 
pervisors of the researchers), and objective 
records of the number of patent disclosures 
produced by each man during a 5-year period. 


Three keys, totaling 59 items, were applied 
to a scoring sample of 100 randomly selected 
Ss. The concurrent validity estimates resulting 
were .613, .521, and .517 for over-all perform- 
ance ratings, creativity ratings, and patent 
disclosures, respectively. These results were 
interpreted as indicative of the value of the 
personal history technique with a highly select 
group. 


REFERENCES


ALBRIGHT, L. E., SMITH, W. J., GLENNON, J. R., &
OWENS, W. A. Estimating validity in double cross-
validation of item analyses. Engng. industr. Psychol.,
in press.

EDGERTON, H. A., FEINBERG, M. R., & THOMSON,
K. F. Prediction of the "human relations" effectiveness
of industrial supervisors. Personnel Psychol.,
1957, 10, 421-431.

FISKE, D. W. Validation of naval aviation cadet
selection tests against training criteria. J. appl.
Psychol., 1947, 31, 601-614.

GUILFORD, J. P. Fundamental statistics in psychology
and education. (2nd ed.) New York: McGraw-Hill,
1950.

JURGENSEN, C. E. A fallacy in the use of median scale
values in employee check lists. J. appl. Psychol.,
1949, 33, 56-58.

KATZELL, R. A. Cross-validation of item analyses.
Educ. psychol. Measmt., 1951, 11, 16-22.

KENNEDY, J. E. A general device versus more specific
devices for selecting car salesmen. J. appl. Psychol.,
1958, 42, 206-209.

KIRCHNER, W. K., & DUNNETTE, M. D. Applying the
weighted application blank to a variety of office
jobs. J. appl. Psychol., 1957, 41, 206-208.

MANDELL, M. Selecting chemists for the federal
government. Personnel Psychol., 1950, 3, 53-56.

MINOR, F. J. The prediction of turnover of clerical
employees. Personnel Psychol., 1958, 11, 393-4[...].

NAYLOR, J. C., & VINCENT, N. L. Predicting female
absenteeism. Personnel Psychol., 1959, 12, 81-84.

OWENS, W. A., SCHUMACHER, C. F., & CLARK, J. B.
The measurement of creativity in machine design.
J. appl. Psychol., 1957, 41, 297-302.

RUNDQUIST, E. A. Personality tests and prediction.
In D. H. Fryer & E. R. Henry (Eds.), Handbook
of applied psychology. Vol. 1. New York: Rinehart,
1950. Pp. 182-191.

STEIN, M. I. Creativity and the scientist. Paper [...]
at National Physical Laboratory, London, [...].

TAYLOR, D. W. Variables related to creativity and
productivity among men in two research laboratories.
In C. W. Taylor (Ed.), The second University
of Utah research conference on the identification
of creative scientific talent, 1957. Salt Lake
City: Univer. Utah Press, 1958. Pp. 20-54.


(Received April 21, 1960) 





Journal of Applied Psychology
Vol. 45, No. 2
April 1961


JOB SATISFACTION, JOB PERFORMANCE, AND SITUATIONAL CHARACTERISTICS


RAYMOND A. KATZELL, RICHARD S. BARRETT, AND TREADWAY C. PARKER
Research Center for Industrial Behavior, New York University


Previous research has yielded inconsistent 
results on the relationship between job satis- 
faction and job performance of industrial 
workers. Recent reviews of research on this 
topic have identified studies in which no sig- 
nificant relationship was apparent, others re- 
porting positive correlations, and even some 
in which an inverse relationship was found
(Brayfield & Crockett, 1955; Herzberg,
Mausner, Peterson, & Capwell, 1957).
Attempts to interpret these facts have led
to theoretical statements which postulate the
influence of a number of additional variables
on the relationship between satisfaction and
performance (Brayfield & Crockett, 1955;
March & Simon, 1958; Morse, 1953). Among
these are the motivations, expectations, and
aspirations of the workers and the rewards
obtainable through the various modes of
behavior possible in the work situation.
Our review of extant research and theory
on the subject has led us to a general model
in which the work situation is regarded as a
system having as separate outputs employee
job satisfaction and performance, and as
inputs characteristics both of the working
environment and of the employees. Various of
the inputs may be expected to affect either or
both of the two sets of outputs via their
effects on employee motivation, ability, or
both. Furthermore, the inputs may be
interactive in their effects. It seems that this
model can accommodate the diverse findings
regarding the relationship between job
satisfaction and performance in various situations.

1 The research reported in this paper was
performed under contract with McKesson and
Robbins. The authors wish to express their
appreciation for the cooperation of the company
in furnishing the data and financial support.

However, most of the research on which 
present information on the subject is based 
has not fulfilled the requirements of such a 
model. Previous research has typically in- 
volved relatively simple designs in which 
characteristics of employees or the work envi- 
ronment have been correlated with job satis- 
faction or performance, or in which satisfac- 
tion and performance have been correlated. 
Katzell (1957) has already pointed out that 
not much can be learned about the relation- 
ships between employee attitudes and per- 
formance from simple two-variable research 
designs of this type. 

The detailing of the general model outlined
above would entail research which examines 
simultaneously data on various inputs and 
outputs. The accumulation of such informa- 
tion should eventually lead to specification of 
the input conditions under which particular 
aspects of satisfaction and performance are 
positively correlated, negatively correlated, or 
uncorrelated. 

The investigation reported here lies within 
the framework of this model by undertaking 
an analysis of employee satisfactions and per- 
formances in relation to certain character- 
istics of the work situation. We were fortunate 
to encounter an industrial setting which not 
only enabled the collection of these kinds of 
data, but which also presented several additional
advantages. There was a sizeable number of
work groups all performing essentially
the same work with the same methods. Com- 
mensurable data were available for these 
groups on several objective measures of job 
performance. In spite of these uniformities, 
there were appreciable differences among the
groups in several aspects of their work situations,
including group size, ratio between male
and female employees, wage rates, whether or 
not the group was unionized, and the size of 
the city in which the organization was located. 

In brief, then, it was the object of this 
investigation to illuminate the relationship 
between job satisfaction and performance, 
through a research design that conceived of 
these variables as outputs of a system which
had as inputs characteristics of the work
situation.


METHOD 
Research Setting 


The study was conducted in an industrial company 
which had 72 wholesale warehousing divisions con- 
cerned with the storage and distribution of drug and 
pharmaceutical products. The divisions were geo- 
graphically decentralized throughout the United 
States, but operated under standard work methods 
and procedures that had been installed on a com- 
pany-wide basis. The key production personnel of 
the warehouses were order-pickers, who filled orders 
from stock on the basis of a form detailing the items 
and their quantities as ordered by each customer. 
This situation lent itself to objective measurement of 
the quantity, quality, and profitability of production 
in each division, by measures to be described below. 
The warehouses had a mean of 35 production workers
each, with a range from 13 to 83. The employees
of 40 of the warehouses were represented by local
unions.


Attitude Survey 


In 1956, an attitude survey was conducted by 
Kolstad Associates among the employees of these 
warehouses. A questionnaire was administered to all 
employees (other than absentees), under supervision 
and conditions of anonymity. It contained 47
multiple-choice items concerning the employees' job
satisfactions. A content analysis of the questions
yielded the following categories of coverage: nature
of the job, job performance, pay, benefits, teamwork,
immediate supervision, management, promotion
opportunities, working conditions, and communications.
It may be noted that this coverage, and the [...]
were later made available to the present
investigators for research purposes.


Performance Measures 


The following measures were selected jointly by
the investigators and company management as
representing meaningful and objective indicators of
performance, deemed to be largely dependent on the
behavior of line employees in the warehouse and to
be comparable among the warehouses. Each measure used
was computed as an aggregate figure for each
warehouse as a whole, for the 1956 calendar year.
Quantity = number of products processed in [illegible]
orders per man-hour of production
Quality = number of errors in filling orders per
100 man-hours of work
Profitability = net profit realized from operation
of the division as a ratio to total dollars of sales;
this figure is affected not only by warehouse
performance but also by variations in gross margins,
wage rates, occupancy costs, etc.
Product-value productivity = sales dollar value per
man-hour of production
Turnover = additions to work force per quarter,
expressed as a percentage of total number employed;
this is an imperfect measure of turnover, since it
reflects expansion as well as replacement and makes
no distinctions as to the reasons for termination.
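
The definitions above are simple ratios, and a short sketch may make the units concrete. The record layout and field names below are hypothetical (the source does not describe how the company stored its figures), and the example numbers are invented for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WarehouseYear:
    """Hypothetical annual record for one warehouse (field names are assumptions)."""
    items_processed: float            # products processed in filling orders
    production_hours: float           # man-hours of production
    errors: float                     # errors made in filling orders
    work_hours: float                 # man-hours of work
    net_profit: float                 # dollars
    sales: float                      # dollars
    quarterly_additions: List[float]  # additions to work force, one per quarter
    employed: float                   # total number employed


def performance_measures(w: WarehouseYear) -> dict:
    """Compute the five measures as they are defined in the text."""
    additions_per_quarter = sum(w.quarterly_additions) / len(w.quarterly_additions)
    return {
        "quantity": w.items_processed / w.production_hours,
        "quality": 100.0 * w.errors / w.work_hours,        # errors per 100 man-hours
        "profitability": w.net_profit / w.sales,            # ratio to sales dollars
        "product_value": w.sales / w.production_hours,      # sales dollars per man-hour
        "turnover": 100.0 * additions_per_quarter / w.employed,  # percent per quarter
    }


# Invented example values, not data from the study.
example = WarehouseYear(1_000_000, 50_000, 4_000, 50_000, 55_000,
                        1_000_000, [3, 4, 2, 3], 30)
measures = performance_measures(example)
```

Note that a higher Quality score means more errors, so the measure runs opposite to its name.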


Situational Characteristics 


It will be recalled that our theoretical model
includes input variables which may be expected to
affect either or both employee satisfaction and
performance. The selection of such variables for
investigation, if it is not to be done on purely pragmatic
grounds, requires looking into the "black box" of
the system and envisioning its dynamic properties.
Among those properties which may be conceived as
relevant both to employee satisfaction and performance
are those relating to the motivations of
employees. Situational characteristics bearing on
these motivations might include those reflecting the
needs and expectations of the employee sample and
those describing the incentives prevalent in the
working environment. Limited by the after-the-fact
nature of the study, we therefore sought in available
records measures that could be regarded as relevant
to such motivational considerations. The following
were employed in the study:
Size of work-force = average number of warehouse
employees over the course of a year
City size = population of city in which warehouse
was located
Wage rate = total straight-time earnings expressed
as a ratio to total straight-time man-hours
Unionization = whether or not warehouse
employees were represented by a union
Percentage male = percentage of warehouse
employees who were men
Descriptive information was also compiled on
company policies and on social and technological
features of the warehouses, as supplementary data
regarding characteristics of the work situation.


RESULTS 


Various correlational analyses were performed
within and between the data on job
satisfaction, job performance, and situational
variables, as specified below.


Performance and Situational Variables


Product-moment correlations are reported
in Table 1 among the five job performance
and five situational variables noted above.
Quantity, Profitability, and Product-value
productivity were highly intercorrelated.
Product-value productivity was so highly
related to the other two that we eliminated
it from subsequent analyses, particularly since
Quantity and Profitability are of more basic
interest both to management and the
psychologist. Quality and Turnover were both
independent of all the other performance
measures.

By breaking down the annual data into
their component four quarters, it was possible
to estimate the reliability of three of these
dependent variables using Horst's (1949)
method. The resulting reliability coefficients
were: Quantity, 0.87; Quality, 0.83;
Turnover, [illegible]. Whereas the reliabilities of the
first two variables are satisfactory, the
Turnover measure is rather unstable. This may
help account for its unrelatedness not only to
the other performance measures, but also to
the situational and satisfaction variables to
be discussed below.
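
The source does not reproduce Horst's (1949) formula, so the sketch below substitutes a different and simpler technique: it averages the correlations among the four quarterly figures and steps that average up to the full year with the Spearman-Brown formula. It is offered only to illustrate how quarterly breakdowns can yield a reliability estimate for an annual measure; it should not be read as the authors' computation, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(mean_r, k):
    """Reliability of a composite of k parts whose mean intercorrelation is mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def annual_reliability(quarters):
    """quarters: list of k lists, each holding one value per warehouse."""
    pairs = list(combinations(quarters, 2))
    mean_r = sum(pearson(a, b) for a, b in pairs) / len(pairs)
    return spearman_brown(mean_r, len(quarters))
```

Under this stand-in, a mean correlation of .50 among four quarters would correspond to an annual reliability of .80.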

The five situational variables were considerably
intercorrelated. Some of these relationships
are not surprising: larger warehouses
are needed in larger communities; the
larger warehouses are also those more likely
to be unionized, since urban culture is common
to both characteristics. The linkage
among Percentage of male employees,
Unionization, and Wage rate suggests some sort of
mutual accommodation or integration among
personal and plant characteristics; perhaps
this is the result of such processes as
self-selection of employees or necessary adjustments
in plant policies.

Many significant correlations emerged from 
the comparison of performance measures with 
situational variables, in line with our expecta- 
tions. In general, Quantity, Product-value 
productivity, and Profitability were found to 
be lower in warehouses which pay higher wage 
rates, are larger in size and located in larger 
communities, are unionized, and have a larger 
proportion of male employees. 

A centroid factor analysis was performed 
on these data to help reveal more clearly the 
structure of the relationships. Two centroids 
emerged which upon rotation were reduced to 
a single common factor. The results appear 
in Table 2, and underscore the relationships 
noted in the preceding paragraph. 
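
As a rough illustration of what a centroid analysis does, the sketch below extracts only the first centroid factor from a correlation matrix. Several choices in it are assumptions rather than anything stated in the source: the communality estimate placed on the diagonal (the largest absolute correlation in each column), the sign-reflection rule, and the small example matrix, which is invented. The study itself extracted two centroids and rotated them, a step not shown here.

```python
from math import sqrt


def first_centroid_loadings(r):
    """First centroid loadings for a symmetric correlation matrix r (list of lists)."""
    n = len(r)
    m = [row[:] for row in r]
    # Assumed communality estimate: largest absolute off-diagonal value per column.
    for j in range(n):
        m[j][j] = max(abs(r[i][j]) for i in range(n) if i != j)
    signs = [1] * n
    while True:
        # Column sums of the matrix after reflecting variables by their signs.
        sums = [sum(signs[i] * signs[j] * m[i][j] for i in range(n))
                for j in range(n)]
        worst = min(range(n), key=lambda j: sums[j])
        if sums[worst] >= 0:
            break
        signs[worst] = -signs[worst]  # reflect the variable with a negative sum
    total = sum(sums)
    # Divide each column sum by the root of the grand total, then undo reflection.
    return [signs[j] * sums[j] / sqrt(total) for j in range(n)]


# Invented three-variable example: two performance-like variables and one
# urbanization-like variable that correlates negatively with both.
example_r = [[1.0, 0.6, -0.5],
             [0.6, 1.0, -0.4],
             [-0.5, -0.4, 1.0]]
loadings = first_centroid_loadings(example_r)
```

In the invented example the third variable receives a loading opposite in sign to the other two, which is the same kind of pattern Table 2 reports for the situational variables relative to Profitability.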

The pattern of factor loadings of the situa- 
tional characteristics suggests that the factor 
represents the degree of urbanization of the 


TABLE 1
INTERCORRELATIONS AMONG PERFORMANCE AND SITUATIONAL VARIABLES
(N = 72)
Correlation Coefficients
Performance Variables    Situational Variables
Variable Mean 
3 4 5 6 7 8 
1 Quantity 22.8 x a À s 
3 Quality 8.3 a 36 00 
Profitability 5.5 " 59 00 56 
Product-value 93.5 1 _it 15 03  —01 
Turnover 10.5 6.6 . 
6 Si 3 -24 -34 -0 
sake si 165 T% b Is -2 -06 6 
: - 7 —22 - 
g City size 323 a La -15 40 05 - s: a 
) 428€ rate 1.48 ; ge 06 —42 —27 -18 5 66 
10 p ™onization i 3 21 -04 —34 o —14 02 09 48 33 
rcentage T 70.2 17. 
Note. Unions were present in 40 (55%) of the 72 warehouses; correlations with this variable are point biserial, with union presence coded as 1 and absence as 0.


TABLE 2

ROTATED CENTROID FACTOR LOADINGS OF
SITUATIONAL AND PERFORMANCE VARIABLES
(N = 72)

Variable                Factor Loadings

Quantity                     .32
Quality                      .08
Profitability                .64
Turnover                    -.32
Size of work-force          -.80
City size                   -.74
Wage rate                   -.58
Unionization                -.64
Percentage male             -.38


setting in which the division is located: low 
urbanization being represented by relatively 
small community population, few employees 
in the division, lower wages, absence of a 
union, and lower proportion of male em- 
ployees. Associated with this urbanization 
syndrome is the adequacy of warehouse per- 
formance. Warehouses characterized by rela- 
tively little urbanization were likely to show 
superior performance (especially financially), 
including some tendency toward low turnover;
however, quality of production is
independent of this factor.


Relations to Job Satisfaction 


Relationships between the attitude ques- 
tionnaire data and the performance and situa- 
tional variables were analyzed on an item-by- 
item basis. This was done partly because of 
our interest in specific relationships and
partly because we were not certain of the 
grouping of questionnaire items into an over- 
all attitude scale or subscales. 

The item score used in this analysis for
each warehouse was computed in the following
way. All items contained subjectively
scaled response alternatives running, for
example, from "I like my job very much" to
"I dislike it very much." A cutting point was
fixed for each item on an a priori basis which
was intended to divide the distribution of
responses of all employees as nearly as
possible into favorable and unfavorable halves.
The percentage of employees in a given
warehouse whose responses fell in the favorable
part of the distribution of an item was used
as the item score for that warehouse. In 29
of the 47 items, the "favorable" category
turned out empirically to contain between
50% and 79% of all respondents; in 10
items, this percentage was less than 50, and
in 8 items it was 80 or more. Variation
among the warehouses in the percentages of
favorable responses to items was adequate,
the standard deviations of item scores ranging
from 6.8 to 21.5 with a median of 13.7.

(It may be of interest to the reader to
know the level and variability of aggregate
job satisfaction scores, although these were
not used in the correlation analysis for
reasons previously stated. Such an aggregate
score was computed for each warehouse by
averaging the percentages of favorable
responses to all 47 items. The distribution of
these scores among the warehouses was
approximately normal, and ranged from [illegible] to
83. The mean of this distribution was [illegible]
and the standard deviation was 8.7.)
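
The scoring just described can be sketched in a few lines. The numeric coding of responses (higher numbers meaning more favorable answers) and the function names are assumptions made for illustration; the source states only that each item had a fixed cutting point separating favorable from unfavorable alternatives.

```python
def item_score(responses, cutting_point):
    """Percentage of one warehouse's responses at or above the cutting point.

    responses: numeric codes for one item, assumed to run from unfavorable
    (low) to favorable (high).
    """
    favorable = sum(1 for r in responses if r >= cutting_point)
    return 100.0 * favorable / len(responses)


def aggregate_score(item_scores):
    """Mean of a warehouse's item scores across all questionnaire items."""
    return sum(item_scores) / len(item_scores)
```

For instance, five employees answering 5, 4, 2, 1, and 3 on an item with a cutting point of 4 would give that warehouse an item score of 40.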

The product-moment correlation between
each of the 47 questionnaire items and each
of the performance and situational variables
was computed. A positive correlation means that
satisfaction, as represented in the item
response, is directly associated with higher
scores in the performance or situational
variable; a negative correlation expresses an
inverse relationship between satisfaction and
the other variable.
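
Table 3 counts items whose correlations exceed the .05 level, given there as r > .23. The sketch below shows how such a cutoff follows from the number of warehouses and how the positive and negative counts could be tallied. The critical t value of 1.994 (two-tailed, 70 degrees of freedom) is taken from standard tables and is an assumption of this sketch, not a figure given in the source.

```python
from math import sqrt


def critical_r(t_crit, n):
    """Smallest |r| significant for n cases, given the critical t for n - 2 df."""
    df = n - 2
    return t_crit / sqrt(t_crit ** 2 + df)


def count_significant(rs, threshold):
    """Return (number of positive, number of negative) correlations beyond threshold."""
    positive = sum(1 for r in rs if r > threshold)
    negative = sum(1 for r in rs if r < -threshold)
    return positive, negative


# With 72 warehouses the cutoff comes out near .23, matching the table heading.
cutoff = critical_r(1.994, 72)
```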

Table 3 reveals that, on the whole, job
satisfactions were positively associated,
beyond chance expectancy, with two aspects of
performance, Quantity and Profitability.
There was no relationship between job
satisfaction and either Quality or Turnover.

It is also clear that job satisfactions were
associated with situational characteristics of
the division. There was typically higher job
satisfaction in situations that had the
earmarks of small town culture than in those
with urban characteristics, i.e., having more
employees, a large city location, higher wages,
union representation, and proportionately
more male employees.

Examination of the item correlations in


TABLE 3

NUMBER OF ATTITUDE ITEMS SIGNIFICANTLY CORRELATED BEYOND THE .05 LEVEL
(r > .23), AND MEAN r's
(N of items = 47)

                      Number of Items Significantly Correlated
                                                                  Mean r
                         Positive         Negative          (r to z trans-
Variable                 Correlations     Correlations        formation)
Quantity                      18               0                +.21
Quality                        0               0                -.02
Profitability                 30               0                +.28
Turnover                       1               0                +.02
Size of work-force             0              23                -.2[illegible]
City size                      0              34                -.22
Wage rate                      0              25                -.32
Unionization                   0              36                -.32
Percentage male                0              29                -.26


terms of content areas of the questionnaire 
revealed no consistent trend for satisfactions 
with given aspects of the job to differ from 
one another in their relationships to perform- 
ance or situational variables. In general, then, 
employees in the small town situations were 
more satisfied with various major aspects of 
their jobs, including supervision, pay, etc.
As an aid in comparing these results with 
those of other studies, it should be noted 
that the correlations reported in the last 
column of Table 3 are means of the 47 items, 
computed on the basis of an r to z trans- 
formation. Most other studies of the relations 
between attitudes and performance have
reported correlations based on aggregate scores
on all the items composing a questionnaire
or a segment thereof. Had this procedure
been followed in the present study, one may
be reasonably sure that the total job
satisfaction score would have shown higher
correlations with Quantity and Profitability than
appear in this column, because the items
tended to have positive intercorrelations and
would in aggregate have constituted a more
reliable instrument than did a single item.
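
The averaging by r to z transformation mentioned above can be illustrated briefly. Each correlation is converted to Fisher's z (the inverse hyperbolic tangent), the z values are averaged, and the mean is converted back to r. The function name is hypothetical.

```python
from math import atanh, tanh


def mean_r_via_z(rs):
    """Average a list of correlations through Fisher's r to z transformation."""
    zs = [atanh(r) for r in rs]       # r -> z
    return tanh(sum(zs) / len(zs))    # mean z -> r
```

Because the transformation stretches large correlations, the result differs from a plain arithmetic mean mainly when the correlations being averaged are large or widely spread; for values in the range reported in Table 3 the two are nearly identical.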


DISCUSSION

The results of this investigation reveal
generally positive relationships between
employee job satisfactions and performances,
under the conditions studied. The question


remains as to why, particularly since many 
of the earlier studies have not found such 
relationships. 

Part of the reason for the difference may, 
of course, reside in the previously noted 
methodological improvements, such as the 
relatively large number of groups, compara- 
bility of work and work methods, commensu- 
rable and objective performance measures, 
etc. 

However, we do not believe that this is the 
sole explanation. We are inclined, rather, to 
interpret the results in terms of our previ- 
ously stated model, which holds that the 
nature of the relationship between satisfaction 
and performance is dependent on the input 
conditions. The key to the conditions deter- 
mining the positive relationship in the present 
study lies in the situational data. These data 
show that the various groups of employees 
studied differ along an urbanization dimen- 
sion, and that the degree of urbanization is 
inversely associated both with satisfaction 
and performance. It is this last circumstance 
that may provide the clue to the positive 
correlations found between satisfaction and 
performance in this study. 

Let us therefore proceed to interpret in 
terms of the model the obtained relationships 
between the situational variables and each 
of these two sets of outputs. It will be
recalled that the model holds that inputs may
affect satisfaction or performance through
their influence on employee motivation, 
ability, or both. The situational variables 
studied in this investigation were selected in 
the belief that they are relevant to employee 
motivations. It developed, somewhat unex- 
pectedly, that the five situational variables 
form a pattern expressing the degree of 
urbanization of the situation. The motiva- 
tional relevance of differences along this 
dimension may be regarded as consisting of 
corresponding differences in culturally deter- 
mined needs and expectations of the em- 
ployees—the kinds of differences that may be 
anticipated in comparing a work group com- 
posed mostly of men who are residents of a 
large city, work in a large warehouse, and 
are unionized, with another group composed 
more of women, who are residents of a town, 
work in a small warehouse, and are not 
represented by a union. 

Given a fairly uniform working environ- 
ment in terms of perquisites, policies, and 
technology, as is the case among the ware- 
houses in this company, variations in satis- 
faction may stem from the differential ful- 
fillment of differing employee needs and 
expectations within this environment. When 
note is taken of such characteristics of the 
job as its relative simplicity, limited upward 
mobility, and low pay rates compared to the 
average for all industry, the inverse relationship
between the extent to which groups show
urban characteristics and their job
satisfaction is understandable. This interpretation is
supported by the finding that the urban
groups are significantly less satisfied with
their pay, even though it is actually somewhat
higher than that of their small town
counterparts.

The basis for the relationship between
urbanization and performance is less apparent
from the information at hand. Again,
our model would point to motivational
differences among the situations. The model
would therefore lead us to interpret the
obtained findings regarding productivity in
terms of differences in the adequacy with
which productive behavior fulfills the needs
and expectations of employees in urban as
compared with nonurban situations; with the
perquisites and rewards of employment
relatively constant across divisions, differences in
adequacy of fulfillment would be attributable
again to differences in the culturally
determined needs and expectations of the
employees. For example, it seems reasonable that
the more female, nonunionized, small town
groups may be more likely to expect
productive behavior to lead to satisfaction of their
particular needs for pay, status, and security
than is the case in the urban groups; the more
male, unionized, urban groups may instead
seek to meet their needs for higher pay
through organized efforts. Social norms
regarding productivity relevant to needs for
peer acceptance may also be quite different
among urban and nonurban groups; in this
connection, it may be noted that instances of
output restriction, such as those described by
Whyte (1955), have typically been reported
in settings bearing a closer resemblance to our
urban than to our nonurban ones. By way of
further illustration, the greater satisfaction
that the less urban groups experience toward
their supervisors could lead such groups to
expect more desirable consequences of performing
in accord with management's norms for
high productivity. While speculative, these
considerations do seem reasonable and are
in line with the information at our disposal.

The interpretation of the findings in terms
of our theoretical model may therefore be
summarized as follows: Among the 72
warehouses of this company, input differences
exist in situational variables affecting
employee needs and expectations. The perquisites
and rewards of employment are essentially
similar in all warehouses, and therefore fulfill
the needs of employees in some of the
warehouses more adequately than they do in
others. This accounts for the significant
correlations obtained between situational (input)
variables and job satisfaction (output)
variables. Furthermore, employees having
different needs exhibit different levels of
productive performance, depending on the
perceived relevance of a given level of
performance to the fulfillment of their needs.
This accounts for the significant correlations
obtained between situational (input)
variables and performance (output) variables.
Finally, it would appear that the patterns of
needs and expectations that are better
satisfied through employment in this work are also
those which are more likely to be perceived
by employees as capable of fulfillment
through productive performance. This
circumstance may explain the significantly
positive correlations obtained between the
two sets of outputs, job satisfaction and
performance.

It should be noted that a positive relation-
ship between satisfaction and performance 
would be expected to appear, in terms of 
this model, only under special circumstances 
such as seem to have existed here. The rela- 
tionship would be absent were employment 
perquisites to vary with employee needs, were 
performance to be independent of employee 
needs (as in the case of paced production), 
etc. It is of interest to note that Worthy 
(1950) has qualitatively described relationships
generally similar to those observed here
among the urbanization, morale, and
performance of various units of Sears, Roebuck; a
review of the nature of the retailing industry
would suggest that it likewise affords
conditions for positively correlated performance
and satisfaction varying with employee 
motivations. 

While we believe that our model therefore
accommodates the facts found in this study,
we recognize that it is by no means thereby
proven. For one thing, the explanation
introduces variables, such as needs, expectations,
and fulfillments, on which we have only
scanty and indirect data. Furthermore, the
evidence is based on correlations, which may
reflect coincidences rather than causality. For
example, company management has reason to
believe that the smaller plants are more
profitable partly, at least, because they are
newer and because in smaller communities
there is greater flexibility of operation with a
more compact trading area to serve. In this
context, warehouse and community size may
be more appropriately considered as control
variables, for which profitability should be
adjusted, than as independent variables. An
extension of this logic would call for the
conception of the situational characteristics
as control variables, resulting in the
conclusion that there is no appreciable correlation
between job satisfactions and performances
beyond what can be attributed to the relations
that they have in common with the control
variables.
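
The control-variable reading described above corresponds to a first-order partial correlation, which removes from the satisfaction-performance correlation whatever both share with a third variable. The sketch below uses the standard formula; the correlations with the control variable in the example are invented, since the source reports no single urbanization score correlated with both outputs.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with the control variable z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical illustration: satisfaction (x) and profitability (y) correlate
# +.28, and each correlates negatively with an urbanization score (z).
adjusted = partial_r(0.28, -0.50, -0.56)
```

With these invented inputs the product of the two correlations with z equals .28, so the partial correlation is essentially zero, which is the outcome the control-variable interpretation anticipates.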

However, for the present, we are more 
inclined to adhere to the aforementioned 
conception of the situational characteristics 
as inputs or independent variables. When this 
view is taken, the results are not inconsistent 
with theory derived from previous evidence. 
Furthermore, the tentative retention of the 
model has the value of leading to further 
testing of its adequacy in this situation. We 
are now, for example, investigating employee 
performance and satisfaction under supervi- 
sors providing different degrees of considera- 
tion and structure initiation (Fleishman, 
1957), with hypotheses holding that the rela- 
tions between supervisory inputs and em- 
ployee outputs are functions of employee 
needs and expectations as reflected in our 
situational variables. The model also leads to 
efforts to improve the measures of employee 
needs and expectations, to the study of inputs 
that have sharper psychological content than 
our gross situational variables, and eventually, 
to experimental change in inputs to test 
whether hypothesized changes in outputs 
would indeed follow. 


SUMMARY


The objective of this study was to illumi- 
nate the relationship between employee job 
satisfaction and performance, through a re- 
search design that conceived of these vari- 
ables as outputs of a system having as inputs 
the characteristics of the work situation. Data 
on employee job satisfaction, job perform- 
ance, and situational characteristics were 
obtained in 72 comparable, geographically 
decentralized warehousing divisions of a com- 
pany. These data were intercorrelated, using 
the division as the unit of analysis. The major
findings include:


1. Job performance is not a homogeneous 
characteristic. Measures of quantity of pro- 
duction per man-hour and profitability are 
intercorrelated, but quality of production and 
turnover are each essentially independent of 
the other performance measures. 

2. Employee job satisfactions, as measured
by questionnaire items, are significantly
greater in those divisions which turn out the
greater quantity of production per man-hour
and which are more profitable. Job
satisfactions are significantly associated neither with
turnover nor quality of production, as
measured here.

3. Five situational variables are intercorre- 
lated and may be represented by a general 
centroid factor characterized as urban vs. 
small town culture. These variables include 
community size, number of employees in the 
division, union representation, average wage 
rate, and proportion of employees who are 
male. 

4. Divisions whose situational character- 
istics are in the direction of the small town 
culture pattern typically have greater em- 
ployee job satisfaction and superior job per- 
formance (in terms of quantity of production 
and profitability); there is some trend for 
such divisions also to have lower rates of
turnover. 

5. Their correlational nature makes the 
results amenable to more than one interpreta- 
tion. The one preferred by the investigators 
regards the situational characteristics as inde- 
pendent variables, with job satisfaction and 
performance as dependent variables which are 
correlated because each is a function of the
same situational characteristics; employee
needs and expectations are postulated as
variables intervening between the situational and
both the satisfaction and performance
variables.


REFERENCES

Brayfield, A. H., & Crockett, W. H. Employee
attitudes and employee performance. Psychol.
Bull., 1955, 52, 396-424.

Fleishman, E. A. The leadership opinion questionnaire.
In R. M. Stogdill & A. E. Coons (Eds.),
Leader behavior: Its description and measurement.
Columbus: Ohio State University, Bureau of Business
Research, 1957.

Herzberg, F., Mausner, B., Peterson, R. O., &
Capwell, D. F. Job attitudes: Review of research
and opinion. Pittsburgh, Penn.: Psychological
Service of Pittsburgh, 1957.

Horst, P. A generalized expression for the reliability
of measures. Psychometrika, 1949, 14, 21-24.

Katzell, R. A. Industrial psychology. Annu. Rev.
Psychol., 1957, 8, 237-268.

March, J. G., & Simon, H. A. Organizations. New
York: Wiley, 1958.

Morse, Nancy C. Satisfactions in the white-collar
job. Ann Arbor: University of Michigan, Survey
Research Center, 1953.

Whyte, W. F. Money and motivation: An analysis
of incentives in industry. New York: Harper, 1955.

Worthy, J. C. Organizational structure and employee
morale. Amer. sociol. Rev., 1950, 15, 169-179.



(Received May 9, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 2, 73-79 


PSYCHOMETRIC SCORE PATTERNS, SOCIAL 
CHARACTERISTICS, AND PRODUCTIVITY 
OF SMALL INDUSTRIAL WORK GROUPS 


THOMAS M. LODAHL 
Massachusetts Institute of Technology 


In a discussion of problems in the selection 
and classification of workers, Ghiselli and 
Brown (1955) point out that there are a 
number of possibilities for combining workers 
to obtain maximum effectiveness of a work 
team where the nature of the work requires 
cooperation among the members. In a laboratory
investigation of naturally occurring combinations,
Ghiselli and Lodahl (1958) found
that the patterns of scores made by group
members on scales derived from a self-description
inventory were related to group effectiveness
in a task requiring close cooperation
within the group. The self-description inventory
(SDI) was one developed by Ghiselli
(1954), and the scales used were Supervisory
Abilities and Decision-Making-Approach
(DMA). For both the Supervisory and DMA
scales, it was found that the most effective
groups were those in which only one member
of the group obtained a high score and the
others obtained uniformly low scores, thus
resulting in a positively skewed distribution
of scores. The mean scores obtained by the
groups on each scale were not related to the
productivity of the groups. Ghiselli and
Lodahl concluded that the balance or pattern
of scores within a group was more important
than merely the amount of a psychometric
characteristic possessed by a group.

The purpose of the present study is to explore
the extent to which such distribution or
pattern effects are related to performance in
industrial work groups. Ghiselli's SDI was
used as the psychometric instrument, and
from the Ghiselli and Lodahl (1958) study,
it was hypothesized that the mean level of
Supervisory and DMA scores in a group
would not be related to group performance,


1 The authors wish to thank the management and
employees of the United Airlines maintenance base in
San Francisco for their cooperation in making this
study possible. Particular gratitude is due L. T. Long,
Hollis Williams, and Robert Daubenmire.




AND LYMAN W. PORTER 1
University of California, Berkeley 


but that the heterogeneity and skewness of 
group scores would show a positive relation to 
performance. Since the industrial groups had 
formally appointed leaders, the relation of the 
leader’s position in the distribution of his 
group’s scores to the group’s performance was 
also assessed. In particular, it was hypothe- 
sized that the higher the leader’s score rela- 
tive to his group on the Supervisory scale, the 
better his group would perform. 

In an effort to attain greater understanding 
of the relationship between psychometric 
score patterns and group performance, the 
social characteristics of the group were also 
examined. In particular, it was reasoned that 
group cohesiveness might be an important 
mediator between score patterns and group 
effectiveness, in that the psychometric com- 
position of the group might affect its cohesive- 
ness, which in turn might affect its productiv- 
ity. Likewise, it was hypothesized that the 
group leader’s popularity with his men might 
affect the relationship between his position in 
the group distribution of psychometric scores 
and the group’s productive effectiveness. 

In the industrial situation studied, the 
groups were not uniform in the degree to 
which cooperation or teamwork was necessary 
in the performance of their work. Therefore, 
an analysis was made of the effect of this 
variable on the strength of the relationships 
stated above. Specifically, it was hypothesized 
that the strongest relationships would occur 
in groups where the necessity for cooperation 
was highest. 

METHOD 


Subjects and Research Setting 


The population used in this study was 567 shop 
workers organized into 62 groups employed at the 
chief maintenance base of an airline. They were from 
the power plant division which was responsible for 
engine overhaul, and the types of operations carried 
out included tearing down and disassembling engines, 
cleaning and inspecting used parts, replacing worn
parts, remachining some of the used parts, reassembling
engines, and finally, testing completely
reassembled engines. There were four major sections of
the division, each under a general foreman; in each
section there were three or four “work centers,” each
under a foreman; in each work center were four or
five basic work groups, called “lead groups,” each
under a “leadman.” These lead groups varied in size
from 4 to 13 men, with a mean of 9.

All personnel at the foreman level and above were 
considered members of management and were not 
included in this study; leadmen and mechanics (the 
airline’s term for any nonsupervisory shop worker, 
regardless of specific duties) were members of a 
machinist's union. Although the leadman was a 
member of the union, he functioned as a "straw 
boss," being in effect the first level of supervision and 
having direct responsibility for the mechanics under 
him. To become a leadman, a mechanic must be with 
the company a minimum of 10 years, must pass a 
job knowledge test, and must have the greatest 
seniority among those eligible for the job. Leadmen, 
thus, were mechanics with high seniority who were 
familiar with the plant's operations. Despite the fact 
that they were nonmanagement, they were judged 
by the management of the power plant to have con- 
siderable influence over the performance and effec- 
tiveness of the men under them. 

The amount of cooperation required of the mem- 
bers of a lead group varied with the nature of the 
duties of the group. For some groups, the men 
worked cooperatively as a team; for other groups
some teamwork and some individual work was
required. In all groups, however, the men worked in
close proximity with each other, and had frequent 
contact with each other and with their leadman. 


Procedure 


The men were tested in 
they arrived fi 




Measurement of Psychometric Variables 


Ghiselli's Self-Description Inventory (Ghiselli,
1954) served as the psychometric instrument in this
study. Two scales from this inventory were scored:
Supervisory Abilities and Decision-Making-Approach
(DMA). These two scales were used in the
Ghiselli and Lodahl study (1958) and are presumed
to have some relation to the organization and control
of group effort. The Supervisory scale is constructed
of items that differentiate between the self-descriptions
of individuals thought adequate for supervisory
responsibilities and individuals considered inadequate
for such positions (Ghiselli, 1954). The DMA scale is
composed of items that discriminate between the
self-descriptions of top- and middle-management
personnel and which seem to describe how individuals
in these groups approach the decision-making
process (Ghiselli & Lodahl, 1958; Porter & Ghiselli,
1957). This latter scale probably measures more than
merely the type of approach to decision-making, but
does appear to reflect qualities which relate to group
functioning.

To test the hypotheses outlined in the introduction,
indices for each lead group were calculated on
both scales: (a) the arithmetic mean; (b) heterogeneity,
measured by the standard deviation of scores;
(c) skewness, calculated by the formula

$$\frac{3(M - Md)}{SD}$$

(this formula for computing skewness differs from
that used by Ghiselli and Lodahl, 1958, because the
method was not appropriate for the larger-sized
groups used in this study); (d) the leadman's score
position in his group, measured by computing his
percentile score within his own group's score
distribution. For Indices a, b, and c, the leadman's own
score was excluded from the calculations.
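The four indices can be illustrated with a minimal sketch. It is not the authors' computing procedure: the function name, the use of the sample (n - 1) standard deviation, and the percentile convention (share of all group scores, leadman included, at or below the leadman's score) are assumptions, since the text does not specify them.

```python
import statistics


def group_score_indices(member_scores, leader_score):
    """Illustrative indices (a)-(d) for one lead group on one scale.

    member_scores excludes the leadman, as stated for Indices a, b, and c.
    """
    mean = statistics.fmean(member_scores)
    median = statistics.median(member_scores)
    # Assumption: sample standard deviation; the paper does not say which form.
    sd = statistics.stdev(member_scores)
    # Skewness formula given in the text: 3(M - Md) / SD.
    skewness = 3 * (mean - median) / sd if sd > 0 else 0.0
    # Assumption: percentile = percentage of all group scores (leadman
    # included) that are at or below the leadman's own score.
    all_scores = list(member_scores) + [leader_score]
    at_or_below = sum(1 for s in all_scores if s <= leader_score)
    percentile = 100.0 * at_or_below / len(all_scores)
    return {
        "mean": mean,
        "heterogeneity": sd,
        "skewness": skewness,
        "leader_percentile": percentile,
    }


# Hypothetical scores: one high scorer among low scorers gives positive skew.
example = group_score_indices([10, 12, 12, 14, 22], leader_score=16)
```

A group with one high scorer and uniformly low remaining scores yields a positive skewness value, which is the pattern the laboratory study associated with effectiveness.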


Measurement of Sociometric Variables 


The method used to measure cohesiveness in this
study is one suggested by Proctor and Loomis
(1951). The basic unit of this measure is a reciprocal
choice (RC) within a group. A reciprocal choice is
one in which Person A chooses Person B, and B also
chooses A. It is reasoned that groups with more
reciprocal choices in proportion to their size are more
cohesive than others.
cohesiveness, 


variations in 


and Loomis 
the result wh 
fractional n 


: F xt 
impossible; RCmax was therefore taken to the next
smaller integer in this study.
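Counting reciprocal choices can be sketched as follows. The function names are hypothetical, and dividing RC by a caller-supplied RCmax is an assumption about the form of the index; the rule used for RCmax itself is not recoverable from the text here and is therefore left as an input.

```python
def count_reciprocal_choices(choices):
    """Count reciprocal choices (RC) within one group.

    choices maps each member to the set of group members he chose.
    A pair counts once when A chose B and B also chose A.
    """
    pairs = set()
    for chooser, chosen in choices.items():
        for other in chosen:
            if other != chooser and chooser in choices.get(other, ()):
                pairs.add(frozenset((chooser, other)))
    return len(pairs)


def cohesiveness_ratio(choices, rc_max):
    """RC expressed as a proportion of RCmax (assumed form of the index).

    rc_max must be supplied by the caller; it is not derived here.
    """
    if rc_max <= 0:
        raise ValueError("rc_max must be a positive integer")
    return count_reciprocal_choices(choices) / rc_max


# Hypothetical four-man group: A-B and C-D are reciprocal, A's choice of C is not.
example_choices = {"A": {"B", "C"}, "B": {"A"}, "C": {"D"}, "D": {"C"}}
example_rc = count_reciprocal_choices(example_choices)
```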

Another sociometric index computed from the 
data was (| 


obtained b; 




These sociometric indices were derived from the
item, "Write down the names of any men in
your lead group whom you would prefer as your
work ...." Thus, these sociometric choices
were oriented toward work relationships instead of
purely social ones.


Measurement of Necessity for Group Cooperation


Ratings were used to measure the degree to which
cooperation was necessary. Supervisory personnel
familiar with the work of the groups made
the ratings, and it was possible to have each group
rated by two supervisors. The ratings were made on
a rating scale, with instructions being to rate only
the necessity to work together, and to disregard
the degree to which the groups actually cooperated.
Each supervisor’s ratings were standardized
on his own mean, and the average of the two ratings
was used as the measure of the necessity for cooperation
for each group. The interrater correlation
coefficients for these data ranged between .20 and
.65. Although these coefficients are low, this is mitigated
to some extent by the fact that the final data
used were means, each based on two supervisors'
ratings. Also, recent evidence published by Buckner
(1959) suggests that “high interrater agreement does
not necessarily imply predictable ratings and may in
some cases indicate a lack of predictability" (p. 64).
Buckner suggests that one reason for this is that
different raters may observe different aspects of
performance, thus leading to lower agreement among
raters but a higher over-all validity, since the
sample of behaviors observed is greater. Such a
phenomenon probably occurred in the ratings of necessity
for cooperation in the present study. Most of the
groups used have a variety of activities, and three of
the pairs of ratings were made by a general
foreman and a foreman; it seems reasonable that
these men each viewed the groups from somewhat
different vantage points, based on different samples
of the groups’ behavior. It was decided to proceed
with the use of these ratings, based on the reasoning
presented above.
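The rating procedure can be sketched as below. Two points are assumptions rather than statements from the text: that "standardized on his own mean" means conversion to z scores within each rater, and that the gain from averaging two raters may be gauged with the Spearman-Brown formula, which the authors do not report applying to these ratings. All function names are hypothetical.

```python
import statistics


def standardize(ratings):
    """Convert one supervisor's ratings to z scores (assumed meaning of
    'standardized on his own mean')."""
    mean = statistics.fmean(ratings)
    sd = statistics.pstdev(ratings)
    if sd == 0:
        raise ValueError("ratings have no variance")
    return [(r - mean) / sd for r in ratings]


def combined_necessity(rater_a, rater_b):
    """Average the two supervisors' standardized ratings, group by group."""
    za, zb = standardize(rater_a), standardize(rater_b)
    return [(a + b) / 2 for a, b in zip(za, zb)]


def spearman_brown(r, k=2):
    """Estimated reliability of the mean of k parallel ratings."""
    return k * r / (1 + (k - 1) * r)


# Applied to the reported range of interrater coefficients (.20 to .65).
low_estimate = spearman_brown(0.20)
high_estimate = spearman_brown(0.65)
```

Under these assumptions the averaged ratings would be somewhat more dependable than a single rater's, which is consistent with the statement that the low coefficients are mitigated to some extent.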


Measurement of Productivity

Since the function of the power plant division was
the overhaul and repair of airplanes, it was not
possible to obtain productivity data in the usual
way. Instead, to obtain a reasonably objective
productivity measure for each group, time standards
data were used. The company had computed time
standards for most of the jobs in the power plant for
cost accounting purposes, and had data available
showing the over-all monthly percentage of time
standards achieved by each lead group. These over-all
monthly percentage figures were used as the basic
productivity measure for each group. Because of
management's view that the time standards were not
equally “tight” from one work center to another (due
to differences in types of jobs), the productivity
percentage for each lead group was standardized on the
mean percentage for the work center to which it
belonged. These work center standardized percentages
served as the productivity data for the lead groups
in computing correlations with other variables
studied, and were based on the average percentage
for the 3 months immediately preceding the collection
of the other data for the study. The reliability
coefficient corrected by the Spearman-Brown formula
and determined from the correlation between
the productivity data for the 2 months preceding the
study was .78. This was considered satisfactory
criterion reliability, since the time standards in this
shop were not used for pay purposes. Also, the
nature of aircraft engine maintenance requires high
emphasis on quality and reliability of work, with
secondary (but strong) emphasis on production.
Under these conditions, variations in the incoming
work will naturally affect the speed with
which the overhaul and maintenance work can be
performed. These factors may explain in part why
the criterion reliability is not higher.
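Standardizing each lead group against its own work center can be sketched as follows. The exact form of standardization is not given in the text; the sketch assumes z scores computed within each work center, and the record layout and function name are hypothetical.

```python
from collections import defaultdict
import statistics


def standardize_within_center(records):
    """Standardize productivity percentages within each work center.

    records is a list of (group_id, work_center, pct_of_time_standard).
    Returns a dict mapping group_id to its within-center z score.
    """
    by_center = defaultdict(list)
    for _, center, pct in records:
        by_center[center].append(pct)

    # Mean and (population) standard deviation for every work center.
    center_stats = {
        center: (statistics.fmean(values), statistics.pstdev(values))
        for center, values in by_center.items()
    }

    standardized = {}
    for group_id, center, pct in records:
        mean, sd = center_stats[center]
        # A center with identical percentages carries no within-center spread.
        standardized[group_id] = (pct - mean) / sd if sd > 0 else 0.0
    return standardized


# Hypothetical data: center B has "tighter" standards than center A.
example_records = [
    ("g1", "A", 90.0), ("g2", "A", 110.0),
    ("g3", "B", 60.0), ("g4", "B", 80.0),
]
example_scores = standardize_within_center(example_records)
```

In the hypothetical data, a group at 110% in one center and a group at 80% in another receive the same standardized score, which is the kind of adjustment for unequal tightness that the procedure is meant to provide.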


RESULTS 


Product-moment correlation coefficients 
were used to evaluate the hypothesized rela- 
tionships. These are shown in Table 1. The 
size of the sample for these computations was 
55 groups. Data from 7 of the original 62 
groups were discarded because one or more 
members of these groups was ill or on vaca- 
tion at the time of data collection. Table 1 
shows that as expected, average level of group 
scores was not related to productivity in these 
groups. Heterogeneity of DMA scores as 
measured by the standard deviation of group 
scores was unrelated to performance, but 
heterogeneity of Supervisory scores was nega- 
tively related to group performance. This 
relationship is opposite to the direction hypothesized.


TABLE 1 
CORRELATIONS OF PSYCHOMETRIC SCORE 
PATTERNS WITH PRODUCTIVITY 

(N is 55 groups) 


Scale 
a c ee 
Score Pattern DMA Supervisory 
Group mean score 04 —.01 
.06 e Dax 


Heterogeneity of scores 

Skewness of scores 

Leadman's percentile 
position in group 


—.20 
—.10 


—.07 
—27** 




The skewness of group scores was not
significantly related to productivity for either
scale. Both of these coefficients were negative,
however, again in the opposite direction from
that hypothesized. The leadman's score position
as measured by his percentile position in
the group on the Supervisory scale was also
negatively related to group productivity. For
this relationship, the possibility existed that
the leadman's divergence from his group's
average was the significant factor in group
productivity rather than his actual position.
Examination of the scatterplot did not reveal
this to be the case: no curvilinearity was apparent
in the scatterplot of leadman's percentile
versus productivity.


In summary, group score patterns on the 
DMA scale showed no significant relation to 
group performance in this study. With the 
Supervisory Abilities Scale, heterogeneity of 
group scores, and leadman’s percentile score 
position in the group were related to produc- 
tivity. Both of these relationships were nega- 
tive, opposite to the direction hypothesized. 
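How a product-moment coefficient based on 55 groups may be evaluated can be sketched as below. This is an illustration only: the text does not state which test or which tail the authors used, the function names are hypothetical, and the critical t value must be read from a t table by the caller (a two-tailed .05 value of about 2.006 for 53 degrees of freedom is assumed in the example).

```python
import math


def t_for_correlation(r, n):
    """Convert a product-moment r based on n cases to a t statistic.

    Returns (t, degrees_of_freedom), with degrees of freedom n - 2.
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t, df


def minimum_significant_r(t_critical, df):
    """Smallest absolute r reaching a given critical t with df degrees of freedom."""
    return t_critical / math.sqrt(t_critical ** 2 + df)


# A coefficient of -.27 based on 55 groups.
t_value, df = t_for_correlation(-0.27, 55)
# Assumed critical value taken from a t table, not computed here.
r_needed = minimum_significant_r(2.006, df)
```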

To assess the possible mediating role of
group social characteristics, the interrelationships
of Supervisory scale score patterns,
sociometric variables, and productivity were
calculated using product-moment coefficients.
These are presented in Table 2. It can be seen
in Table 2 that cohesiveness tends to be negatively
related to heterogeneity of Supervisory
scores and positively related to productivity.
The index of leader popularity likewise tends
to be negatively related to the leadman's
percentile position and positively related to
productivity. These results indicate limited


TABLE 2
INTERRELATIONSHIPS AMONG SUPERVISORY SCALE SCORE
PATTERNS, SOCIOMETRIC VARIABLES, AND PRODUCTIVITY

                                     Sociometric Variables
                                  Cohesiveness   Leadman's
                                                 Popularity
Supervisory Scale Score Patterns
  Heterogeneity                      -.28**        -.24*
  Skewness                           -.21           .08
  Leadman's percentile position      -.07          -.20
Productivity                          .19           .24*

* p < .10.
** p < .05.


TABLE 3 
CORRELATIONS BETWEEN PREDICTOR VARIABLES AND 
PRODUCTIVITY OF GROUPS CLASSIFIED BY DEGREE
OF NECESSITY FOR COOPERATION AMONG GROUP 
MEMBERS 


Necessity for Cooperation 
Predictor (vs. Productivity) 


Low Medium High 
5 —,53* 
Heterogeneity (Supervisory scale) .05 1 zi 
Skewness (Supervisory scale) OL x E 
Leadman's percentile position "em 
(Supervisory scale) =e =i a 
Cohesiveness — A 10 p 
Leader popularity. ae ae 16 
N (of groups) 
* p «.05 
** p <.01 


support for the idea that score patterns exert
part of their influence on group productivity
through their relation to the social characteristics
of the group.

The effects of the technical necessity for
cooperation among group members on the relationships
of score patterns and social characteristics
to productivity were examined by
breaking down the total sample of groups
roughly into thirds by the degree of rated
necessity for cooperation and computing correlations
separately for each subsample. The
results of this analysis are shown in Table 3.
(Numbers of groups shown are smaller than
in preceding tables because ratings of necessity
for cooperation were not available for all
groups.) In this table, all of the significant
relationships appear in groups in which necessity
for cooperation is high. One predictor,
leadman’s percentile position, showed significant
relationships in preceding analyses but
failed to reach significance in this analysis,
although its correlations are moderately high
and consistently negative over all degrees of
necessity for cooperation.


DISCUSSION


The results obtained in this study
show that for the total sample of industrial
work groups studied, patterns of group scores
on the DMA scale of Ghiselli’s SDI were unrelated
to group productivity, but certain patterns
on the Supervisory scale were related to
group performance. These Supervisory score
patterns also tended to be related to group
cohesiveness and to the leadman's popularity;
these were in turn related to productivity in
such a way as to suggest that part of the obtained
relation of score patterns to productivity
is mediated by the social characteristics of
the group. Taken in sum, the results indicate
that psychometric pattern indices in this industrial
study are not related to group performance
in any simple manner predictable
from the laboratory study. The findings seem
to point instead to the existence of a complex
interaction among psychometric patterns, social
factors, and group productivity. While
the obtained correlations are low and run in
unexpected directions, it seems useful to explore
possible bases for these relationships,
utilizing the data on social characteristics and
necessity for cooperation.

As a whole, these findings can best be
understood if the nature of the leadman's
position is taken into account. He is nominally
in charge of his group, but is not officially
a member of management; in fact, he
is a union member, as are his group members.
Members of management of the shop state
in interviews that the leadman has direct
responsibility for the performance of his
group and that he does in fact exercise great
influence on group performance. Yet in this
shop the leadman is given little if any formal
power over the group in terms of management-authorized
sanctions such as power to
hire, discharge, promote, etc. Lacking these
formal means of control, it seems that the
leadman must resort to informal, interpersonal
ways of influencing his group.

It makes sense then that those leadmen
who are sociometrically popular with their
members would be able to exercise greater
influence over them. The leadman
whose Supervisory score is high relative to his
group is likely to be less popular with his men
(see Table 2), and thus not able to use
this means of control. This seems to be a
situation in which a trait (supervisory ability)
which would ordinarily be an asset to a
leader becomes a liability, given the means of
influence available to him. Here, if he
differs too much from his men on the trait, he
tends to lose his popularity and thus his influence.
In this connection it is interesting
that the only strong relation between leader
popularity and group productivity occurs in
groups in which a high degree of cooperation

among members is necessary. The type of 
leadership employed by the highly chosen 
leader seems to be a significant influence only 
in situations where coordination of the efforts 
of individuals is important in determining 
group output. 

Heterogeneity and group cohesiveness be- 
gin to fit in if we suppose that a highly cohe- 
sive group has a greater ability to govern 
itself than a less cohesive one; control of 
members’ behavior by the group is possible. 
The leadman’s problem in this case is to guide 
this force in directions beneficial to the total 
organization, while not losing his own influ- 
ence with the group. Greater heterogeneity of 
group supervisory scores generally is asso- 
ciated with low cohesiveness, which in turn is 
associated with low productivity.

These relationships can be understood when
the patterns of choice in highly cohesive 
groups are examined. It appeared that those 
groups scoring high on the cohesiveness meas- 
ure had for the most part oriented their 
choices toward the leadman, who had in turn 
reciprocated most of these choices. Statis- 
tically, this was tested as the relation between 
leader popularity and group cohesiveness: the 
correlation coefficient was .42, significant be- 
yond the .01 level. It may be that this social 
pattern provides one of the bases for the lead- 
man's informal influence, in that in the cohe- 
sive groups he may function as a social “gate- 
keeper,” rewarding by social choice those who 
contribute and punishing those who do not by 
rejection. If he controls the social pattern to 
this extent, this could also give the means of 
guiding the power for control implied by 
group cohesiveness to benefit the total organ- 
ization. This also helps explain why cohesive- 
ness is positively related to productivity, if 
we assume most leadmen hold goals of high 
productivity. Again here it is important to 
note that the only significant relationships 
occur in groups where a high degree of cooperation
is necessary. Apparently on relatively
independent jobs group psychometric
and social characteristics are less important in 
determining productivity. 

To some extent these findings parallel those
of the Michigan studies of productivity,
supervision, and morale (Kahn, 1956; Seashore,
1954). The earlier studies, reviewed by
Kahn, indicated that there is more relationship
between supervisory attitudes and productivity
than between group attitudes and
productivity. Although the leadmen in the


gation of group cohesiveness and productivity, 
it was found that highly cohesive groups had 


and that the 
a function of 
larger organ- 
groups in which 
high showed a 
ohesiveness and 
tude data were 
osphere in the com- 
likely that most of 
ong positive atti- 
and to the plant 


cohesiveness; in 


CONCLUSIONS
The results obtain 


study suggest that 
can b 


group cohesiveness and leader popularity, are
important in attaining an understanding of the
nature of the relationships between score patterns
and productivity. Furthermore, it is also clear
from these results that the strongest effects of
score patterns on productivity will be found
in groups where necessity for cooperation is 
high. Finally, on the basis of reasoning pre- 
sented in the discussion section, it seems 
likely that some consideration must be made 
of the managerial situation of the group 
leader (in terms of sanctions available to 
him) if a clear understanding of this over-all 
problem is to be attained. 


SUMMARY 


This study was concerned with the idea
that group productivity in industrial work
groups may be related to the patterns of
psychometric scores formed by combining individuals
into groups, and that such score
patterns may exert part of their influence
through affecting social characteristics of the
group. Data were obtained from members of
55 industrial work groups on Ghiselli’s Self-Description
Inventory and a sociometric
questionnaire. For the Supervisory Abilities
scale of the Self-Description Inventory, heterogeneity
of group scores as measured by
the standard deviation of scores was negatively
related to productivity, as was the
leader's percentile score position within his
own group. These variables were also negatively
related to group cohesiveness and the
leader’s sociometric popularity with his men;
cohesiveness and leader popularity were in
turn positively related to productivity. Using
ratings of “the necessity for group cooperation”
in performing the group task, it was
found that the strongest relationships between
the predictor variables and productivity were
in groups where necessity for cooperation was
high. It was concluded that patterns of psychometric
scores in industrial work groups
may bear some relation to group productivity,
but this relation is affected by social characteristics
of the group and the relation of the
group to the leader. These score pattern effects
and social influences on productivity are
strongest in groups where the work situation
requires a high degree of cooperation among
group members.


REFERENCES 


Buckner, D. N. The predictability of ratings as a
function of interrater agreement. J. appl. Psychol.,
1959, 43, 60-64.
Ghiselli, E. E. The forced-choice technique in personnel
selection. Personnel Psychol., 1954, 7, 201-208.
Ghiselli, E. E., & Brown, C. W. Personnel and industrial
psychology. New York: McGraw-Hill, 1955.
Ghiselli, E. E., & Lodahl, T. M. Patterns of managerial
traits and group effectiveness. J. abnorm.
soc. Psychol., 1958, 57, 61-66.
Kahn, R. L. The prediction of productivity. J. soc.
Issues, 1956, 12, 41-49.
Porter, L. W., & Ghiselli, E. E. The self-perceptions
of top and middle management personnel.
Personnel Psychol., 1957, 10, 397-406.
Proctor, C. H., & Loomis, C. P. Analysis of sociometric
data. In M. Jahoda, M. Deutsch, & S. W.
Cook (Eds.), Research methods in social relations.
Part 2. New York: Dryden, 1951.
Seashore, S. E. Group cohesiveness in the industrial
work group. Ann Arbor: Survey Research Center,
1954.

(Received April 2, 1960)


Journal of Applied Psychology
1961, Vol. 45, No. 2, 80-85


PSYCHOLOGICAL VERSUS SOCIOLOGICAL VARIABLES 
IN STUDIES OF VOLUNTEER BIAS IN SURVEYS 


C. R. BELL 1


London School of Economics and Political Science 


Investigators in the fields of opinion, mar- 
ket, and social survey research must consider 
the effects of basing their analyses on data 
obtained from less than 100% of the population
sample chosen for study. The validity of
extrapolating results of these analyses to the 
not-reached part of the population must also 
be examined. There is little evidence to deny 
that volunteers differ (other than in their 
volunteering behavior) from others whose at- 
titudes, preferences, opinions, and behavior 
have not been studied in the survey. Unfortu- 
nately it is also undeniable that there is little 


conclusive evidence of the ways in which they 
do differ. 


Some investi 
tical analysis 
turns to mail 
to estimate th 
sults 


1 Present address: Medical Research Council Climate
and Working Efficiency Unit, Department of
Anatomy, South Parks Road, Oxford, England.


The present paper seeks to draw together
and examine the findings of the various empirical
studies of volunteer bias in order that
some assessment might be made of the adequacy
of the information at present available.
This data source has been supplemented by
the inclusion of hypotheses published by research
workers who have based them on experience
and direct observation of volunteer
subjects.


PROBLEM 


When the percentage of the sample not
providing data is small, it may sometimes be
justifiably assumed that its exclusion from the
analyses will have an almost negligible effect
upon the final conclusions of the survey. In
general, however, it may be that serious bias
and distortion in survey results become more
likely as the proportion of the sample not
reached becomes greater.
In studies of the population of Great
Britain there may be a loss of 5% or more
(Gray & Corlett, 1950; Moser, 1949) due to
prolonged illness, absence from home, moves
to another district, or death. In addition to
these losses from the sample, there is also a
loss of those who are reached but who explicitly
refuse to cooperate. With a well-designed
survey by interview or mail questionnaire,
with well-trained interviewers, an
adequately piloted question schedule, with
real incentives, the number of refusers may be
kept to a minimum. Even with several follow-up
(call-back) stages, however, a large proportion
of the percentage lost may be due to
nonavailability of the persons chosen for
study. The hard-to-reach individuals are
often lost because the survey administrator
cannot afford the time and extra expense involved
in pursuing them to contact (Hilgard
& Payne, 1944; Lundberg & Larsen, 1949).
It is not rare to find reports of surveys
(Clausen & Ford, 1947; Norman, 1948) in
which less than one-half of those originally
chosen for study have provided data for analysis.
If the research worker is to establish
control and weighting systems, then it is 
essential that the effects of incomplete cov- 
erage of the survey sample population be 
investigated and understood. Basically, the 
interest of most research workers is not in 
obtaining a picture of all the ways in which 
those who provide data differ from those who 
do not, but rather their concern is with those 
differences which have a particular relevance 
to the topic they are studying. It is difficult 
to predict, a priori, what these will be in any 
given investigation. Research workers may, in 
Consequence, try to achieve some degree of 
control by examining the data-providing (vol- 
unteer) part of the sample and weighting the 
distributions of those variables for which 
there is information available for the parent 
Population (Clausen & Ford, 1947; Ford & 
Zeisel, 1949; Rollins, 1940; Shuttleworth,
1941; Stanton, 1939; Zimmer, 1956). But
attempts at securing representativeness by
weighting scales and other correction devices,
such as taking early versus late respondent
differences as a guide to replier versus nonreplier
differences, may be invalid unless it
can be shown that weighting is in terms of
“characteristics relevant to the study” (Ferber,
1949).
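The weighting of a volunteer sample to known parent population distributions can be sketched as follows. The strata, proportions, and function names are hypothetical and are not taken from any of the studies cited; the sketch shows only the arithmetic, and it does nothing to meet the objection that the weighting variable may be irrelevant to the topic studied.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight for each stratum = population share / share among respondents."""
    total = sum(sample_counts.values())
    weights = {}
    for stratum, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"no respondents in stratum {stratum!r}")
        weights[stratum] = population_shares[stratum] / (count / total)
    return weights


def weighted_mean(stratum_means, sample_counts, weights):
    """Respondent mean after each stratum is re-weighted to its population share."""
    numerator = sum(
        weights[s] * sample_counts[s] * stratum_means[s] for s in sample_counts
    )
    denominator = sum(weights[s] * sample_counts[s] for s in sample_counts)
    return numerator / denominator


# Hypothetical survey: urban respondents are over-represented among volunteers.
counts = {"urban": 60, "rural": 40}
shares = {"urban": 0.5, "rural": 0.5}
means = {"urban": 0.2, "rural": 0.6}
example_weights = poststratification_weights(counts, shares)
adjusted = weighted_mean(means, counts, example_weights)
```

In the hypothetical figures the unweighted respondent mean is .36 and the weighted mean is .40; the correction moves the estimate toward the population only to the extent that respondents within a stratum resemble the nonrespondents of that stratum.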
Despite warnings that sample stratification
“on the basis of objective indices alone
[which are] largely sociological in nature
may not be sufficient” (Maslow & Sakoda,
1952), many survey researchers still seem to
assume that a sample which matches the parent
population’s distribution by such variables as
occupational level, social class, and household
composition, invariably provides a
basis for generalization of the results
to the general population. Even with an
optimistic assessment of achieving complete
representativeness in terms of these variables
(Crossley, 1941; Gray & Corlett, 1950;
Moser, 1949, 1955), the problem remains of
assessing the relevance of each one of
them to the attitudes, preferences, and opinions
in which the researcher is interested.

Sociological Variables 

Attempts have been made to identify 
characters which are coincidental with some forms 
of volunteering. There have been investiga- 
tions (e.g., Kruglov & Davidson, 1953; Rosen, 
1951; Siegman, 1956) to discover basic per- 
sonality traits associated with the willingness 
and desire to volunteer, and studies (e.g., 
Crossley & Fink, 1951; Hilgard & Payne, 
1944; Reuss, 1943) of the secondary char- 
acteristics, usually sociological, of the section 
of the population who volunteer. Unfortu- 
nately the latter studies have been, in the 
main, confined to post hoc investigations 
which have limited application beyond the 
situation which gave rise to them. Variables 
studied and conclusions drawn are rarely the 
same in any two reports. Because of this 
specificity, it is difficult to draw many con- 
clusions about general volunteer character- 
istics of a sociological nature. The information 
presents an ambiguous picture. In certain 
studies some variables are shown to be asso- 
ciated with bias and in others they are shown 
to be not associated. Some of the variables 
cited may be factors relevant to physical 
availability at the time of the survey. 

In considering the apparently quite differ- 
ent forms of behavior which are included 
under the concept “volunteer,” the difficulty 
of establishing universal sociological variables 
becomes understandable. Homogeneity in any- 
thing but the word "volunteer" is not easily 
seen in: (a) volunteers who accept and keep 
active membership of a listening, viewing, or 
household budget panel; (b) volunteers who 
answer questions put to them by a charming 
person who interviews them in the home or 
in the street; (c) volunteers who leave their 
homes to participate in depth interviews, 
taste-testing sessions, group discussions, or 
program previews; and (d) volunteers who 
complete and return mail questionnaires. 
Apart from reviews (Clausen & Ford, 1947; 
Norman, 1948) of studies in volunteer (re- 
spondent) bias in mail-questionnaire investi- 
gations, there has been no attempt to examine 
and compare systematically, the nature of 
volunteering in all these situations. 

A further cause of confusion in the ap- 
praisal of studies using sociological variables 
is that investigators, it appears, have not dis- 
tinguished between variables relevant to the 
behavioral predisposition to volunteer and 
variables associated with a person's avail- 




TABLE 1 

VARIABLES EXAMINED FOR THEIR ASSOCIATION WITH VOLUNTEER BIAS 


PSYCHOLOGICAL VARIABLES 

Terms used to describe the volunteer 
    Better adjusted 
    More articulate 
    Less authoritarian 
    Greater candor 
    Less conventional 
    More conscientious 
    Curious 
    Less defensive 
    Greater democratic potential 
    More drive 
    Ego-satisfied status 
    Less ethnocentric 
    Favorable attitude to Negroes 
    Flexible in interpersonal relations 
    Greater frankness 
    Habits of promptness 
    More interest in the topic 
    Lonesome 
    Less nervous 
    Greater optimism 
    More poise 
    More polite 
    Less power preoccupied 
    Less rigid 
    More self-assurance 
    More self-discipline 
    Higher self-esteem 
    More sexually active 
    Less stereotyped in thinking 
    Less tendency to projection 
    Greater tolerance of others 


SOCIOLOGICAL VARIABLES 

Volunteering affected by 
    Age 
    Church going and religious ties 
    Ethnic background 
    Home ownership 
    Household composition 
    Marital status 
    Occupational status 
    Ownership of specific goods 
    Rural or urban background 
    Sex 
    Socioeconomic status 
    Telephone ownership 
    Years at school 


OTHER VARIABLES 

For mail questionnaire studies 
    Questionnaire design and length 
    Covering letter 
    Efficiency and organization of follow-up stages 
    Method of delivery 
    Respondent attitude to sponsor 
    Anonymity, task, and reward 

For face-to-face requests for volunteers 
    Attractiveness of alternate situation 
    Observation of the reactions of others 
    Private or public conditions of volunteering 
    Stimulus-request intensity 


ability at the time of the survey (i.e., when 
the interviewer calls; after the receipt of the 
mail questionnaire; at the date and time of 
the group session). In Table 1 it may be 
seen that some variables: religious affiliation 
(Wallin, 1949a, 1949b), ethnic background 
(Kruglov & Davidson, 1953; Pan, 1951), and 
urban or rural family background (Kruglov 
& Davidson, 1953; Reuss, 1943; Wallin, 
1949a), may be associated with the volunteering 
syndrome. Others, for example: telephone 
ownership (Wallace, 1954), marital 
status (Wallin, 1949a; Zimmer, 1956), and 
household composition (Hilgard & Payne, 
1944), may have a closer association to 
factors of “contactability” or “time-to-spare- 
ability.” Age, frequently examined as either a 
psychological or sociological variable, seems 
often to be significant in relation to the time 
spent at home of the various age groups 
rather than in relation to a desire to volun- 
teer or avoid volunteering. Thus young moth- 
ers and elderly retired persons may decep- 
tively appear to be “volunteers” when merely 
they are those who are rarely not-at-home 
when the interviewer calls. 

A purely quantitative approach (Belson, 
1959; Clausen & Ford, 1947) to the identification 
of variables associated with volunteer 
bias in surveys may tend to mask a distinction 
between factors of volunteering and 
availability. Weighting on the basis of numerical 
analysis alone contributes little to the 
understanding of sources of bias specifically 
arising from the use of volunteer subjects. 
Consideration of the ways in which volunteer 
bias operates is essential to the generalization 
of findings from one volunteering situation 
to another. If the relevance of variables in 
sample stratification and weighting devices is 
not understood, pseudo-corrections may be 
producing more distortion in the results of 
analysis than in the original data. 
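
A small numerical sketch may make the notion of a pseudo-correction concrete. All figures below are hypothetical assumptions chosen for illustration: volunteering is made to depend on interest in the topic, while the weighting variable (age) is unrelated to either interest or opinion.

```python
# Hypothetical population: in every age group half the people are
# interested in the topic; 80% of the interested and 20% of the
# uninterested hold a favorable opinion.
FAVORABLE_IF_INTERESTED = 0.8
FAVORABLE_IF_UNINTERESTED = 0.2
true_rate = 0.5 * FAVORABLE_IF_INTERESTED + 0.5 * FAVORABLE_IF_UNINTERESTED

# Hypothetical volunteers: 90% of them are interested, in both age groups.
volunteer_rate = 0.9 * FAVORABLE_IF_INTERESTED + 0.1 * FAVORABLE_IF_UNINTERESTED
volunteer_rate_by_age = {"under_40": volunteer_rate, "40_and_over": volunteer_rate}

sample_share = {"under_40": 0.3, "40_and_over": 0.7}      # as obtained
population_share = {"under_40": 0.5, "40_and_over": 0.5}  # parent population


def mean_over_strata(rates, shares):
    """Average the stratum rates using the given stratum shares."""
    return sum(rates[s] * shares[s] for s in rates)


unweighted = mean_over_strata(volunteer_rate_by_age, sample_share)
weighted = mean_over_strata(volunteer_rate_by_age, population_share)
```

Under these assumptions the estimate stays near 0.74 before and after weighting by age, against a true rate of 0.5. The sample now looks representative on age, yet the bias arising from volunteering is untouched.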


Psychological Variables 

The search for basic or universal [...] 
by which the volunteer might be described 
has led [...] to the use of psychological 
inventories and personality tests. Volunteers have 
been examined in terms of their [...] 
the Berkeley F Scale, [...] PEC Scale ([...]); 
the Minnesota [...] ([...], 1951); the Taylor 
Manifest Anxiety Scale (Himelstein, 1956; 
Siegman, 1956); scales of aggression-conventionality 
and conservatism-liberalism (Wallin, [...]); 
defensiveness, rigidity (Siegman, [...]); 
self-esteem (Maslow & Sakoda, 1952; Siegman, 
1956); and social participation (Gough, 
1952; Lundberg & Larsen, 1949). It is 
unfortunate that in all studies except one 
(Lundberg & Larsen, 1949) volunteers were 
not drawn from a representative section of the 
general population. In addition, it should be 
noted that the situations for which volunteers 
were requested were not of the kind usually 
found in the field of market and survey re- 
search. These considerations severely limit 
the generality of the findings unless it can be 
shown that volunteering motivations are the 
same in all subsections of the general popula- 
tion and in all situations for which volunteers 
are requested. On the evidence to hand this 
proposition seems unlikely to be supportable. 
In most of the studies using personality tests 
it is difficult to translate findings expressed in 
test jargon into language which is more di- 
rectly appropriate to those working in situa- 
tions outside the context and orientation of a 
particular psychological laboratory. It is, for 
example, not easy to convert into behavioral 
terms of, say, marketing or program prefer- 
ences such descriptions of the volunteer as 
one having “a greater self-discipline and tol- 
erance of others” (Gough, 1952), “a higher 
democratic potential” (Kruglov & Davidson, 
1953), or “less tendency to projection" 
(Rosen, 1951). 

Though these studies may go some way 
towards the delineation of the volunteer per- 
sonality and his motivations, they leave a 
wide gulf between their findings and the 
immediate needs of the investigators in the 
field to whom the problem of volunteer bias 
is real and acute. Attempts to bring together 
laboratory findings and field needs are most 
urgently required. The splendid isolation of 
the laboratory worker’s test jargon and the 
sporadic, ad hoc, approach to the subject of 
field workers serve only to aggravate the lack 
of communication between the academic and 
field approach to problems of mutual interest. 
There have emerged, however, several general 
factors which appear to cut across this divi- 
sion. From an examination of the literature it 
may be suggested that the volunteer: (a) will 
tend to have a greater interest in the topic 
being studied; (b) will tend to be better 
educated or have spent more time at school; 
(c) will be able to talk without embarrass- 
ment about his social, economic, occupational 
status; and (d) will tend to have a favorable 
attitude to the survey sponsor or representa- 
tive. These variables appear to be not inde- 
pendent of each other. 




Other Variables 


A set of factors related to response rates 
or volunteer bias are those concerned with the 
nature of the invitation, the magnitude of the 
effort demanded of the volunteer, and the 
administration and technique of the survey. 
Studies dealing with response rates to mail 
questionnaires (Clausen & Ford, 1947; 
Mitchell, 1939; Norman, 1948; Wallace, 
1954) suggest the relevance of questionnaire 
length and design; the mode of delivery; the 
timing of follow-up letters; and the respondent's 
task, reward, and anonymity. Studies of 
response to face-to-face requests for 
volunteers indicate [...] intensity of the 
[...] attractiveness of alternate behavior 
[...] (Blake, Berkowitz, Bellamy, & Mouton, 1956; 
Rosenbaum, 1956; Rosenbaum & Blake, 1955). 
DISCUSSION 

The most obvious deficiency in the literature 
is a report of [...] planned investigation 
[...] “volunteer.” [...] widely diverse [...] 
basis of the [...] 
many occasions what has been taken to be a 
purposeful wish to volunteer has been no 
more than availability at the time of the sur- 
vey and a lack of awareness of, or strong 
feelings about, being used as a subject. The 
characteristics of this type of so-called volun- 
teer may differ quite markedly from the 
characteristics associated with an active 
willingness or desire to volunteer. Although 
questions of individual motivations to volunteer 
would seem to be an important part of any 
consideration of the subject, little information 
or even suggestion is to be found. Greenberg 
(1956) has suggested that people who volunteer 
do so because they are either too polite 
to refuse, or curious, or lonesome. 

Studies in which psychological tests have 
been used have underlined the laboratory 
worker’s ineptitude in communication with 
others who do not share the particular 
orientation he has to questions of personality and 
the tests used to investigate it. The list of 
sociological variables produces an incomplete 
and ambiguous picture. The reasons why such 
a variable is relevant as, for instance, 
religious affiliation and church-going (Reuss, 
1943; Rosen, 1951; Wallin, 1949a, 1949b) 
should be understood before that variable is 
included in a weighting device aimed at the 
correction of volunteer bias. A whole set 
of similar variables, identified as having some 
association with survey response through 
statistical analyses of returns, gives no guarantee 
of correction of bias in the information 
gathered unless there is some understanding 
of the relationship between these variables 
and the kind of information given. “Blind” 
weighting may aggravate bias already present 
or may introduce new sources of bias 
unrecognized by the survey investigator. 


CONCLUSIONS 

Attempts to discover the nature of volunteer 
bias provide many hypotheses and almost no 
conclusions. Before adequate sample stratification 
schemes and correction devices can be 
drawn up it is necessary for the apparent 
heterogeneity of volunteering behavior to be 
examined. There is also a need to clarify the 
distinction between a willingness (desire) to 
volunteer and a physical availability together 
with the inability to refuse to cooperate. It 
seems that neither a purely psychological, nor 
purely sociological, nor purely statistical ap- 
proach to the study of volunteer bias is ade- 
quate to deal with the problem. This appears 
to be one subject in which an interdiscipline, 
laboratory and field, investigation would be 
most useful. 
REFERENCES 

Belson, W. A. Matching and prediction in the social 
sciences. Nature, Lond., 1959, 183, 772. 

Blake, R. R., Berkowitz, H., Bellamy, R. Q., & 
Mouton, J. S. Volunteering as an avoidance act. 
J. abnorm. soc. Psychol., 1956, 53, 154-156. 

Clausen, J. A., & Ford, R. N. Controlling bias in 
mail questionnaires. J. Amer. Statist. Ass., 1947, 
42, 497-511. 

Crossley, H. M. Theory and application of representative 
sampling as applied to marketing. J. 
Market., 1941, 5, 456-461. 

Crossley, H. M., & Fink, R. Response and non-response 
in a probability sample. Int. J. opin. attitude 
Res., 1951, 5, 1-19. 

Ferber, R. A rejoinder. Publ. opin. Quart., 1949, 13, 
562-563. 

Ford, R. N., & Zeisel, H. Bias in mail surveys cannot 
be controlled by one mailing. Publ. opin. 
Quart., 1949, 13, 495-501. 

Gough, H. G. Predicting social participation. J. soc. 
Psychol., 1952, 35, 227-233. 

Gray, P. G., & Corlett, T. Sampling for the Social 
Survey. J. Roy. Statist. Soc., Ser. A, [...], 
150-199. 

Greenberg, A. L. Respondent ego involvement in 
large scale surveys. J. Market., 1956, [...]. 

Hilgard, E. R., & Payne, S. L. Those not at home: 
Riddle for pollsters. Publ. opin. Quart., 1944, 8, 
254-261. 

Himelstein, P. Taylor scale characteristics of volunteers 
and nonvolunteers for psychological experiments. 
J. abnorm. soc. Psychol., 1956, 52, 138-139. 

Kruglov, L. P., & Davidson, H. H. The willingness 
to be interviewed: A selective factor in sampling. 
J. soc. Psychol., 1953, 38, 39-47. 

Lundberg, G. A., & Larsen, O. N. Characteristics of 
hard-to-reach individuals in field surveys. Publ. 
opin. Quart., 1949, 13, 487-494. 

Maslow, A. H., & Sakoda, J. M. Volunteer-error in 
the Kinsey study. J. abnorm. soc. Psychol., 1952, 
47, 259-262. 

Mitchell, W., Jr. The rate of return on mailed 
questionnaires. J. Amer. Statist. Ass., 1939, 34, 
683-692. 

Moser, C. A. The use of sampling in Great Britain. 
J. Amer. Statist. Ass., 1949, 44, 231-259. 

Moser, C. A. Recent developments in the sampling 
of human populations in Great Britain. J. Amer. 
Statist. Ass., 1955, 50, 1195-1214. 

Norman, R. D. A review of some problems related 
to the mail questionnaire. Educ. psychol. Measmt, 
1948, 8, 235-247. 

Pan, Ju-Shu. Social characteristics of respondents 
and non-respondents in a questionnaire study of 
later maturity. J. appl. Psychol., 1951, 35, 120-121. 

Reuss, C. F. Differences between persons responding 
and not responding to a mailed questionnaire. 
Amer. sociol. Rev., 1943, 8, 433-438. 

Rollins, M. The practical use of repeated questionnaire 
waves. J. appl. Psychol., 1940, 24, 770-772. 

Rosen, E. Differences between volunteers and nonvolunteers 
for psychological studies. J. appl. Psychol., 
1951, 35, 185-193. 

Rosenbaum, M. E. The effect of stimulus and background 
factors on the volunteering response. J. 
abnorm. soc. Psychol., 1956, 53, 118-121. 

Rosenbaum, M., & Blake, R. R. Volunteering as a 
function of field structure. J. abnorm. soc. Psychol., 
1955, 50, 193-196. 

Shuttleworth, F. Sampling errors involved in incomplete 
returns to mail questionnaires. J. appl. 
Psychol., 1941, 25, 588-591. 

Siegman, A. W. Responses to a personality questionnaire 
by volunteers and nonvolunteers to a Kinsey 
interview. J. abnorm. soc. Psychol., 1956, 52, 280- 
281. 

Stanton, F. Notes on the validity of mail questionnaire 
returns. J. appl. Psychol., 1939, 23, 95-104. 

Wallace, D. A case for—and against—mail questionnaires. 
Publ. opin. Quart., 1954, 18, 40-52. 

Wallin, P. An appraisal of some methodological 
aspects of the Kinsey report. Amer. sociol. Rev., 
1949, 14, 197-210. (a) 

Wallin, P. Volunteer subjects as a source of sampling 
bias. Amer. J. Sociol., 1949, 54, 539-544. (b) 

Zimmer, H. Validity of extrapolating nonresponse 
bias from mail questionnaire follow-ups. J. appl. 
Psychol., 1956, 40, 117-121. 


(Received April 11, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 2, 86-90 


A HIERARCHY OF “PERCEPTUAL USEFULNESS” 
OF GEOMETRICAL CUES IN AN OVERLEARNED 
DIAL-READING TASK¹ 


HILDE GROTH AND JOHN LYMAN² 


University of California, Los Angeles 


The utility of particular classes of geometrical 
cues for training on dial-reading 
tasks seems to be relatively unexplored. 
Although psychological studies of dial design 
variables have a history of approximately 20 
years and the resulting literature is voluminous, 
little emphasis has been directed toward 
changes occurring in the usefulness of a given 
perceptual cue structure during a prolonged 
training period. [...] designed to determine a 
rank order of performance [...]. 

[...] investigation are a number [...] 
conducted within the framework of 
information theory (Anderson & Leonard, 
1958; Rappaport, 1957; Senders & Cohen, 
1955) [...] terms information [...] 
framework outlined by Bruner (1957) [...] 
considers perception as a [...] 
process affected by the situational [...] 
available discriminatory cues. His [...] 
provided the basis for our experimental 
hypothesis that the level of performance is 
directly related to perceptual cue [...] 
and only incidentally to the amount of 
information, redundancy, and noise in a given 
task. 

¹ This investigation [...] laboratory [...] 
the Office of [...] Psychology Branch, under [...]. 
[...] gratefully acknowledge [...] 
Winter, engineering aide, [...] constructed 
the test booklets, collected the [...] 
statistical analyses. 


METHOD 

Rationale. In order to develop eventually a valid 
empirical system of “cue-usefulness” hierarchy for 
the general class of dial-reading tasks, we [...] 
one extreme case for the initial step. A task was 
chosen to fulfill the assumption that all [...] 
reached a final performance criterion of 100% 
accuracy if conventional cues were presented. This 
criterion was met by substituting a clock face for an 
arbitrary dial, which can be considered an acceptable 
laboratory abstraction of a dial-reading task. It also 
permitted systematic addition and deletion of various 
types and amounts of geometrical signal [...]. 

Task. Examination booklets were prepared 
containing 24 pages with 12 “dials” of the same 
geometrical configuration on each page. The 288 
“clock-times” were determined by means of tables of 
random numbers. Order of the geometrical configurations 
for each type of background was counterbalanced 
over Ss. Figure 1 represents a composite of 
typical dial configurations and backgrounds employed. 

All conventional [...] display. “Usefulness” [...] 
geometrical cues was explored in an 8 × 3 × 21 [...]: 

1. Pointers in the correct position but [...] 
an unstructured field 
2. Addition of X, Y axes to the blank field 
3. Enclosure of pointer tips by a perimeter 
4. Addition of heavy dots to emphasize the 12, 3, 
6, and 9 o'clock positions 
5. Blank background 
6. Cross-hatched background 
7. Dotted background 

Two criterion measures were obtained: (a) number 
of correct responses (a correct response was 
Fig. 1. Typical dial configurations and three types of background. 

TABLE 1 
RANK ORDER, PERCENTAGE CORRECT AND COMPLETED RESPONSES, AND THEIR 
CORRELATION FOR [...] TREATMENTS, AVERAGED OVER BACKGROUNDS 

                                               Correct Responses    Completed Responses 
Treatment                                      Rank Order     %     Rank Order     %      Correlation 
Pointers                                           8        40.0        5        81.3        .35 
Pointers + Perimeter                               5        51.7        4        82.2        .74 
Pointers + Reference Axes                         2.5       55.9        2        84.9       [...] 
Pointers + Dots                                   6.5       48.3        8        77.9       [...] 
Pointers + Perimeter + Reference Axes             2.5       55.9        3        83.6       [...] 
Pointers + Perimeter + Dots                        1        56.6        1        [...]      [...] 
Pointers + Dots + Reference Axes                  6.5       48.3        7        [...]      [...] 
Pointers + Dots + Perimeter + Reference Axes       4        52.5        6        [...]      [...] 




TABLE 2 
SUMMARY OF SIGNED RANK TESTS FOR PAIRED COMPARISONS 
BETWEEN ANY TWO EXPERIMENTAL CONDITIONS 

Condition                                   (2)       (3)       (4)       (5)       (6)       (7)       (8) 
(1) Pointers Only 
      Correct Responses                   +24.0     +4.0**   +33.5**    [...]     [...]     [...]     [...] 
      Completed Responses                 +82.5    +40.0**   -54.0*    +85.5     +55.0      [...]     [...] 
(2) Pointers + Perimeter 
      Correct Responses                            +55.0    -113.0    +38.5**   +24.5**    [...]    +62.0 
      Completed Responses                          +53.0*    -28.0**  +103.0    +58.5      [...]     [...] 
(3) Pointers + Reference Axes 
      Correct Responses                                      25.0**    -99.0     [...]     [...]     [...] 
      Completed Responses                                    15.0**    -71.0    +87.5      [...]     [...] 
(4) Pointers + Dots 
      Correct Responses                                                [...]     [...]     [...]     [...] 
      Completed Responses                                             +26.0      [...]     [...]     [...] 
(5) Pointers + Perimeter + Reference Axes 
      Correct Responses                                                         +72.0      [...]    -80.5 
      Completed Responses                                                       +96.5      [...]     [...] 
(6) Pointers + Perimeter + Dots 
      Correct Responses                                                                    [...]     [...] 
      Completed Responses                                                                 -57.5      [...] 
(7) Pointers + Reference Axes + Dots 
      Correct Responses                                                                              [...] 
      Completed Responses                                                                            [...] 
(8) Pointers + Reference Axes + Perimeter + Dots 

 * α.05 = 59. 
** α.01 = 43. 

defined as ±1 minute deviation from the actual 
setting), (b) number of completed responses. 

Subjects. Twenty-one student Ss served in the 
experiment. 

Procedure. The test was administered individually 
to each S. [...] 

[...] of difference among experimental [...] 
was assessed by analysis of variance by [...] 
differences between any two [...] by 
the Mann-Whitney U test, and the relationship 
between the two criterion [...] 
rank order correlation (Walker & Lev, 1953). 
Because the three types of backgrounds did 
not produce a significant performance difference, 
the geometrical cues have been averaged 
over all backgrounds for statistical analysis. 
This method has enabled us to minimize possible 
artifacts occurring in the distribution of 
“clock-times” for a given background. 

[...] Document No. 6549 from ADI Auxiliary Publications 
Project, Photoduplication Service, Library of Congress; 
Washington 25, D. C., remitting in advance 
$1.25 for microfilm or $1.25 for photocopies. [...] 
checks payable to: Chief, Photoduplication Service, 
Library of Congress. 

For tabulation, it was considered to be 
more meaningful to convert the criterion 
measures from average numbers to percent- 
ages. Table 1 shows the rank order, the per- 
centage of correct and completed responses, 
and their correlation for each of the eight 
conditions of cue configurations. 
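
The rank orders reported in Table 1, including the tied ranks of 2.5 and 6.5, follow from the percentages by the usual average-rank rule. The sketch below is an illustration only: the percentages are the percent-correct figures as they can be read from Table 1, and their assignment to the eight conditions in the order of Table 2 is an assumption.

```python
def average_ranks_descending(values):
    """Rank values from highest (rank 1) to lowest, giving tied
    values the mean of the ranks they jointly occupy."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Percent correct for conditions (1) to (8); ordering is an assumption.
percent_correct = [40.0, 51.7, 55.9, 48.3, 55.9, 56.6, 48.3, 52.5]
accuracy_ranks = average_ranks_descending(percent_correct)
```

With these inputs the function yields the accuracy rank order 8, 5, 2.5, 6.5, 2.5, 1, 6.5, 4.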

Although the relationship between the de- 
pendent variables was positive, it was not 
high. Speed of performance was found to be 
much less variable (maximum difference of 
about 8%) than accuracy (maximum differ- 
ence of about 17%). The significance of the 
difference between any two treatments has 
been summarized in Table 2. 
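
Table 2 is titled a summary of signed rank tests. As a minimal sketch of how such a statistic is obtained for one pair of conditions, the function below computes the signed-rank T (the smaller of the two rank sums) from paired scores. The scores are hypothetical, and the function is a generic textbook version rather than the authors' own computation.

```python
def signed_rank_T(first, second):
    """Signed-rank T for paired scores: rank the absolute differences
    (zero differences dropped, ties given average ranks) and return
    the smaller of the positive and negative rank sums."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]]):
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(positive, negative)


# Hypothetical numbers of correct responses for five Ss under two conditions.
condition_a = [10, 12, 9, 14, 11]
condition_b = [8, 13, 6, 10, 11]
T = signed_rank_T(condition_a, condition_b)
```

A small T means that nearly all Ss differed in the same direction. The obtained value is compared with a tabled critical value for the number of Ss.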

The rank order of percentage correct and 
number of completed responses provides an 
index of the perceptual usefulness of the 
employed cues, both singly and in combinations. 

The rank order indicates that optimum 
performance has been obtained in terms of 
speed and accuracy for the following conditions: 

1. Pointers plus reference axes 
2. Pointers plus perimeter plus reference axes 
3. Pointers plus perimeter plus dots 

The rank order of all eight dial configurations 
for accuracy is shown in Figure 2. 
Fig. 2. Performance rank order for accuracy 
for eight dial configurations. 


DISCUSSION 

The experimental results appear to [...] 
that even though the information [...] 
make a decision may clearly be [...] 
configurations of cues are [...] greater 
[...] than others as aids to the [...]. 
[...] performance obtained with the two extreme display 
configurations as well as with [...] 
six other combinations [...] 
show a trend which [...] have been [...] 
information, [...]. [...] believe 
that an “index of [...]” 
operationally defined in terms of level of 
performance may represent [...] 
visual task designs. [...] 
optimum level of [...] 
neither the minimum number of cues necessary 
for making a correct response nor is it 
the maximum number of redundant cues. 
The fact that “best performance" was ob- 
tained for a visually “clean” but not min- 
imally cued display is in agreement with good 
human engineering principles (Garvey & Mit- 
nick, 1955; McCormick, 1957). The problem 
still remains as to just what defines a “clean” 
display objectively, and no hypothesis can be 
advanced by the authors as to possible rea- 
sons for the superiority of the particular con- 
figurations used or for the functional ineffec- 
tiveness of the changes in background. Since 
number of redundant cues alone is not a 
sufficient criterion, perhaps the answer must 
be sought in figure-ground relationships in 
pattern perception rather than in S-R terms. 
Compatibility of cues within a display ap- 
pears to be of at least equal importance as 
compatibility between displays and controls. 

The stability of the obtained hierarchy of 
these cue configurations at various stages of 
training will be investigated in a series of 
long-term experiments. We hope such an 
investigation will enable us to accumulate 
enough data to permit inferences about the 
more basic variables determining the utility 
of a given configuration. 


SUMMARY 


This study was designed to define a hier- 
archy of “perceptual usefulness” of geomet- 
rical cues in an overlearned dial-reading task. 
The hypothesis was postulated that perform- 
ance is a function of “perceptual usefulness of 
cues” rather than of the amount of informa- 
tion, redundancy, and noise present in a given 
situation. 

Examination booklets were prepared con- 
taining 24 pages with 12 “dials” of the same 
geometrical configuration on each page. A 
total number of 288 “dials” had to be read 
and recorded by each S. For each page, 48 
seconds were allotted for completion. 
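
The booklet arithmetic can be checked with a short sketch. The counts (8 cue configurations, 3 backgrounds, 12 dials per page, 48 seconds per page) come from the text; the pseudo-random generator and its seed merely stand in for the tables of random numbers and are assumptions.

```python
import itertools
import random

CUE_CONFIGURATIONS = 8
BACKGROUNDS = 3
DIALS_PER_PAGE = 12
SECONDS_PER_PAGE = 48

# One page per combination of cue configuration and background.
pages = list(itertools.product(range(1, CUE_CONFIGURATIONS + 1),
                               range(1, BACKGROUNDS + 1)))

# Stand-in for the tables of random numbers (assumption): a seeded generator
# drawing an hour (1-12) and a minute (0-59) for every dial.
rng = random.Random(0)
booklet = {
    page: [(rng.randint(1, 12), rng.randint(0, 59)) for _ in range(DIALS_PER_PAGE)]
    for page in pages
}

total_dials = sum(len(dials) for dials in booklet.values())
total_seconds = len(pages) * SECONDS_PER_PAGE
seconds_per_dial = SECONDS_PER_PAGE / DIALS_PER_PAGE
```

This gives 24 pages and 288 dials, with 4 seconds available per dial and 1152 seconds of working time in all.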

The task consisted of reading “clock-times” 
on these “dials.” It was [...] 
fulfilled the [...] 
[...] background were combined in a factorial 
design with 12 replications of different pointer 
settings for each of the 24 combinations. The 
test was administered individually in a 
treatment by Ss counterbalanced design. Twenty-one 
student Ss served in the experiment. 

Results supported the hypothesis, and a 
rank order of “perceptual cue-utility” was 
found. The implications of the results for [...] 
design have been discussed, and the stability 
and generality of the findings at various stages 
of training will be investigated in further 
studies. 
REFERENCES 

Anderson, Nancy S., & Leonard, J. A. The recognition, 
naming, and reconstruction of visual figures 
as a function of contour redundancy. J. exp. 
Psychol., 1958, 56, 262-270. 

Bruner, J. S. On perceptual readiness. Psychol. Rev., 
1957, 64, 123-152. 

Cherry, C. On human communication: A review, a 
survey, and a criticism. New York: Wiley, 1957. 

Garvey, W. D., & Mitnick, L. L. Effect of additional 
spatial references on display-control efficiency. 
J. exp. Psychol., 1955, 50, 276-297. 

McCormick, E. J. Human engineering. New York: 
McGraw-Hill, 1957. 

Rappaport, M. The role of redundancy in the 
discrimination of visual forms. J. exp. Psychol., 1957, 
53, 3-10. 

Senders, Virginia L., & Cohen, J. Effects of sequential 
dependencies on instrument-reading performance. 
J. exp. Psychol., 1955, 50, 66-74. 

Walker, Helen M., & Lev, J. Statistical inference. 
New York: Holt, 1953. 


(Received April 20, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 2, 91-96 


THE EFFECTS OF PERSONNEL REPLACEMENT 
ON AN INFORMATION-PROCESSING CREW 


MILES S. ROGERS, JOHN D. FORD, JR., AND JACK A. TASSONE 


System Development Corporation 


Almost all human organizations are subject 
to changing membership throughout the 
course of their existence. Problems of labor 
turnover have been dealt with in studies by 
Kangan (1948), March and Simon (1958), 
and Rice (1958). Problems of the effects of 
changing membership have also been the 
focus of small group studies, Simmel (1955), 
Mills (1957), and Borgatta and Bales (1953). 
Both from the industrial approach and the 
small group approach, there is agreement that 
intact groups are desirable and that anything 
that tends to destroy group membership 
integrity is likely to degrade the group's 
performance. 

Quite the opposite point of view appears to 
be held by man-machine system designers who 
seem to treat individuals as interchangeable 
units over a wide range of behavior. As 
Forgays and Levy (1957) point out in the 
[...] of a military system: 

[...] individual [...] policy has many 
[...] training and unit performance in 
[...] a commander may be [...] 
performance [...] decrements connote [...] 
flexibility. The concept of crew [...], 
however, implies that a unit is a unique [...] 
will suffer in performance if membership changes 
occur (p. 1). 

Forgays and Levy go on [...] 
of data collected on [...] 
crews, that, in general, [...] 
medium number of [...] 
combat performance [...] 
high- or low-change crews. 

[...] Fisher (1917) [...] 
in terms of the [...] 
production while replacements [...] 
achieve the production levels [...] 
men they replaced. Kangan [...] 
suggested that turnover in one section of a 
[...] might affect output [...], 
especially where the work was organized 
by the “chain” method. 

More recently, Duncan (1955) reported an 
index called “skill dilution" which can be 
used to measure the effect of labor turnover 
on the “pool” of skill available at any time. 

The effect of turnover on an information- 
processing system, according to the concept of 
skill dilution, ought to be fewer items of in- 
formation correctly processed when compared 
with a similar system without turnover. Fur- 
thermore, since many information-processing 
systems are organized by the “chain” method 
(ie., information enters the system at one 
place, is processed and passed on, reprocessed 
and passed on again, etc.), the degradation in 
performance should be reflected in the system 
output unless the other members of the sys- 
tem can develop ways of overcoming the 
“bottleneck” effect of the new member. 
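
The “bottleneck” argument can be illustrated with a deliberately simple model. It assumes, purely for illustration, that each position in the chain can pass on a fixed number of items per minute and that the system output is limited by the slowest position. The rates are hypothetical, and this is not the skill dilution index itself.

```python
def chain_output(items_per_minute_by_position):
    """Output of a strictly serial chain under the assumption that
    no position can pass on more than it receives or can handle."""
    return min(items_per_minute_by_position)


# Hypothetical processing rates for a four-position chain.
experienced_crew = [10, 10, 10, 10]
crew_with_replacement = [10, 6, 10, 10]  # second position newly replaced

output_before = chain_output(experienced_crew)
output_after = chain_output(crew_with_replacement)
```

In this sketch a single slower member lowers the output of the whole chain from 10 to 6 items per minute, unless the remaining members find some way of compensating, which the simple model does not represent.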

The present study was undertaken as a 
preliminary step in the experimental investi- 
gation of this area where membership was 
deliberately manipulated as an independent 
variable. Since this study was undertaken as 
part of a larger research program directed at 
improving the Air Defense Command’s Sys- 
tem Training Program, the questions posed 
were specific to information-processing sys- 
tems and training. Specifically, the questions 
were: 

1. What effect does the replacement of per- 
sonnel in a complex information-processing 
system have on the ability of the system to 
accomplish its mission? 

2. To what extent do the concepts derived 
from the analysis of labor turnover account 
for the turnover effects in information-proc- 
essing systems? 

3. To what extent can the expected degra- 
dation in performance be overcome by various 
training methods? 


This study was intended to get preliminary 
answers to the above questions. Although 
several independent variables were introduced, 
only those combinations of variables 
which seemed, a priori, to be potentially most 
informative were investigated. Therefore, the 
resulting “experimental design” was incomplete. 
Nevertheless, the results seemed to 
have important implications for man-machine 
information-processing systems. Consequently, 
this study has been reported in its present 
form despite the fact that not all combinations 
of all variables were studied and there 
was little or no replication for those combinations 
that were investigated. 

¹ See Goodwin (1957) for a description of this 
program. 


PROCEDURE 


The seven-m 
in this Study was a sim 


scopes” 
ved from 


System Description 
The white dots proj t 
against a black rectangular g ap ine 


1n sequence were referred to as “poj. 


M. S. Rogers, J. D. Ford, Jr., and J. A. Tassone 


number, and time on the plastic screen alongside the trail of blips. On subsequent frames he updated the position of the track with appropriate symbols.

This track information was transmitted through a series (or chain) of data-processing positions as follows: The scope teller read the information which had been affixed on the input screen by the scope reader and transmitted it over voice phone to the conversion plotter who was plotting in Cartesian coordinates on a large plotting board, whereupon the conversion teller converted the information to polar coordinates and transmitted it by voice phone to a second remote plotting location. There the polar plotter plotted the information on a polar coordinates board.
The track correlator was given auxiliary information concerning the programed flight plans of certain tracks. When the positions of the track and the flight plan information were sufficiently close (within 15 miles and 2 minutes), the track correlator declared the track to be “known.” Otherwise, the track was declared “unknown.” This information was given to the supervisor who passed it on to the other crewmen so that they would not continue to maintain “known” tracks.
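The correlation rule described above can be sketched as follows. This is only an illustration under stated assumptions: the coordinate representation, the function names `is_known` and `classify`, the use of straight-line distance, and the treatment of the limits as inclusive are all hypothetical. Only the 15-mile and 2-minute tolerances come from the text.

```python
import math

# Tolerances stated in the text.
MAX_MILES = 15.0
MAX_MINUTES = 2.0


def is_known(track, plan):
    """Return True when a track lies close enough to one flight plan.

    track and plan are hypothetical (x_miles, y_miles, time_minutes)
    tuples; straight-line distance and inclusive limits are assumptions
    of this sketch, not details given in the article.
    """
    tx, ty, tt = track
    px, py, pt = plan
    distance = math.hypot(tx - px, ty - py)
    return distance <= MAX_MILES and abs(tt - pt) <= MAX_MINUTES


def classify(track, plans):
    """Label a track "known" if any flight plan matches it, else "unknown"."""
    return "known" if any(is_known(track, p) for p in plans) else "unknown"
```

A track with no flight plan on file, or one that misses every plan on either tolerance, is labeled "unknown" and would continue to be maintained by the crew.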
Each experimental session involved the operation of the laboratory system by a group of subjects (a crew) for 40 minutes. Following this, the crew met in a separate room where they were provided formal knowledge of performance results and were allowed to discuss these results as a group.

Turnover was introduced by replacing one crew member prior to the next experimental session. A different position was "turned over" each time. This turnover rate resulted in complete turnover of the crew at the end of eight sessions. The control condition consisted of crews operating the system under equivalent conditions without replacement of personnel. These stable crews were labeled “A” crews, and other crews were turnover crews (B, C, D); see Table 1. Different turnover crews had their personnel replaced in different order. However, the order differences were not confounded with the other experimental variables.


TABLE 1 


CHARACTERISTICS OF CREWS


Stable Crews 


Turnover Crews 
Training Initi i ini Tnitial Skill of Crew 
aini: nitial Skill T itial Ski if 
Cut i raining Replacement — Initial Skill of Crev 
crion High Low Criterion P skal igh Tow 
Trials As A Trials ws P 
Low Ci, De Bi 
Hi 
Performance A Performance 
Low C, Ds p» 
a Condition producing most skill dilution.
b Condition producing least skill dilution.


Re placement in an Information-Processing Crew 93 


System Experience 


Under the "skill dilution" concept there must be skill to be diluted by turnover if turnover is to have an effect. Therefore, the amount of system experience of the original members of each crew was varied for different crews. Table 1 shows that there were two stable crews (A, A1) and two turnover crews (B, B1) whose original members had no system experience when they were formed. All other crews (A2, C, D, C1, D1, B2) had original crew members with system experience ranging from 1-15 sessions. Crews B2 and A2 were matched on this variable; all original members had had eight sessions of system experience prior to becoming members of these crews. The other four (C, D, C1, D1) were matched in that all had a similar wide range of system experience in their initial complement.

In a similar way, the amount of system experience of the replacements for the turnover crews was varied for one crew, B2. Its replacements came from Crew B, and were, therefore, variously system experienced in the same position. All other turnover crews (B, B1, C, D, C1, D1) had replacements who had no previous system experience when they reported for work with the crew.


Trainin g 


All subjects had some form of individual instruction before assuming their positions as members of a


ent of Crews An, ^h 2, 


instruction was pro- 


Perie; 

; e act as the original crew. mem 
individu: Ei into D;. These individuals © 
is ne instruction from the person 
Session struction occurred during the 

eir q just prior to the one where t 

A different individual training procedure was provided the original and replacement members of Crews B, C, and D. All subjects were trained individually and were required to perform a practice problem with less than a certain number of errors per run (specific to each position) within three repetitions of the problem. Thus, each individual was trained to a criterion of performance before joining his crew. The former training procedure resulted in equal individual training experience, the latter resulted in equal individual performance proficiency.

2 This treatment was introduced in an attempt to cover, with a minimum number of crews, the range of variation of training methods.

After each run, all crews had an opportunity to


discuss their performance i 
c : n a face- 
ic top 45 minutes. This pe SS 
€ es pes except Di, which P "sem 
10 in viue -exercise and 35 minutes pre ar 
eru Le (see Footnote 2). RI 
aa des reported to all crews. Thi € 
E up the number of tracks rane 
ved s e system during the trial T ics 
— pe ened es the Air Defense Com Ee 
€ breed sqm and previous lab se 
epis e at more performance ed 
poe ie e expected if such system perfo: EE 
mu ere supplemented with detailed i aaa 
beds diee ga of each crew Sees 
a v. Es method was used with Crew D. ko 
e eile ose performance summai Ei 
pe ee Ds : knowledge of results. Finall Ww 
— al position-oriented knowledge d 
eus diss euer by written em de s 
ce ae ors which would focus the s eee 
ucc sone E aspects of his bap Tos 
Meg ce x ke improvement. Thus, in x 
eA aes m of the errors each man made ( been 
oe d khowledge of result) is 
pap mon ed with problem-focused in m 
sy Hs A s were given a written statement pu 
5 dns o parts of his task that he T aE 
CE rst. This method was used with us 
s kin ng H result, had the same kind of feed’ d 
oed, AE lus written constructive criticism. pa 
B vo m s pec e the variables EROS i 
esa n w D; replicated Crew Cı and Gra D 
esa Jd C. However, these crews differed E 
the nite hs psi knowledge of results each oe ED 
ee Y had a minimal amount. Cr ae 
hearer oe mplex knowledge of results and fae 2 
ost complex feedback given to pua s. 
crew. 


Data Collected 
Each position in the sy: 

least one observer Ic e Ec by at 
ae 15d also monitored the SD tal corr RS. 
m de ~ from the position which cn 
output Or x pdt input to the position EM 
d niea e position was recorded. e mm 
ees E was obtained of the perform: TE 
5 i of each of its members mic 
supervisor. This statistic was the ratio of except the 
of task units (aircraft tracks) proceed ay tent i 
i ampe one This measure, Piles ii num- 
ae [S e the performance effectiven iplied by 
position) and was reported = ne 
a per- 


centage. 
RESULTS AND DISCUSSION


The results of this study are shown in Figures 1, 2, and 3. The performance effectiveness of each crew has been plotted for each experimental session. The stable crews, A and A1, appeared to improve with practice, as did two of the turnover crews, B and B1. The other five turnover crews indicated either a failure to improve or even a decline in performance effectiveness with similar experience. These results suggest the anticipated answer to the first question posed by this study. The effect of crew turnover on a complex information-processing system was, for some of the experimental conditions, to degrade its ability to perform its mission.

Fig. 1. Performance of inexperienced crews. (Axes: System Performance Effectiveness, per cent, against Experimental Sessions.)


Fig. 2. Performance of experienced crews. (Axes: System Performance Effectiveness, per cent, against Experimental Sessions.)


However, the amount of degradation appeared to depend on the amount of skill depletion produced by the turnover.

Turnover crews with no initial system experience, B and B1, performed at least as well and seemed to improve as much with practice as comparable stable Crews A and A1 (see Figure 1). This condition represents minimal skill dilution because the B and B1 crews had no original pool of skill. The rate of retardation in the accumulation of system experience by these crews did not appear to affect their ability to improve with practice.

The most skill dilution in this study occurred in crews originally composed of men with system experience and whose replacements were men who had only individual training prior to joining the crew. This condition was represented by turnover Crews C, C1, D, and D1 (see Figures 2 and 3). All four crews fall short of the performance expected of stable crews with comparable experience. Crew A2 was designed as a control for Crews C1 and D1. Unfortunately, unanticipated turnover forced this crew to terminate short of the number of experimental sessions necessary for this comparison. The theoretical stable crew performance curves shown in Figures 2 and 3 were used to evaluate the effect of this level of skill dilution in lieu of comparable experimental data. (The derivation of these theoretical stable crew performance curves will be discussed in a later section.)

With the possible exception of some of the results from Crew D, these data show considerable degradation in performance associated with a kind of turnover which produced a high amount of skill dilution. (Compare Crews C1 and D1 with A2. Also compare C and D with the theoretical stable crews.) Note that Crew B2 was intermediate in both skill dilution and performance degradation.

In view of this analysis, the second question posed for this study has been partly answered. The concept of “skill dilution” can be used to account for the direction and relative magnitude of the effect of turnover on an information-processing system.

The industrial and the small group literature in this area suggested that the effect of turnover ought to extend beyond the position replaced, especially in a chain-type organization where the input to one man is the output from another. The following method was used to determine whether turnover in one position was associated with a deviation from expected performance in another position.

One-half of the data were used to obtain a performance improvement curve
for each position. A negatively accelerated
exponential curve was fitted to these data by 
the method of least squares. Checks for inter- 
nal consistency were made by using the other 
half of the data, as follows: (a) The number 
of experimental sessions (experience) accumu- 
lated by each crew member on each day was 
entered in the performance curve for his 
position. This resulted in a predicted perform- 
ance score for each of the six information- 
processing positions in the system. (b) The 
product of these six scores yielded the the- 
oretical system performance score for the day. 
The rank-order correlation between these 
theoretical scores and the observed perform- 
ance of the stable crews was +.89 (p < .01). 
The size of this correlation justified the use 
of these performance equations to predict 
stable crew performance for experience ranges 
beyond those sampled experimentally (see the 
theoretical stable curves, Figures 2 and 3). 
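The prediction procedure just described can be sketched in code. The article gives neither the fitted equation nor its parameters, so the curve form p(n) = asymptote - gain * exp(-rate * n), every parameter value, and all function names below are assumptions made for illustration only.

```python
import math


def position_score(sessions, asymptote, gain, rate):
    """Negatively accelerated exponential learning curve (assumed form).

    Returns a proportion for a crew member with the given number of
    sessions of experience. All parameters are hypothetical.
    """
    return asymptote - gain * math.exp(-rate * sessions)


def system_score(experience, params):
    """Theoretical system score: the product of the position scores."""
    score = 1.0
    for sessions, (asymptote, gain, rate) in zip(experience, params):
        score *= position_score(sessions, asymptote, gain, rate)
    return score


def rank(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_order_correlation(xs, ys):
    """Spearman rank-order correlation (Pearson r computed on the ranks)."""
    rx, ry = rank(xs), rank(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical example: six positions sharing one set of curve parameters.
params = [(0.95, 0.50, 0.30)] * 6
theoretical = [system_score([n] * 6, params) for n in (1, 2, 4, 8)]
```

Because the system score is a product, one inexperienced member lowers the predicted score for the whole crew, which is how the "skill dilution" idea enters the prediction.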

The difference between expected and actual 
performance of each member of the crew was 
calculated for each session. These differences 
were arranged according to the position 
turned over. If the turnover of one position 
interacted on another position, the perform- 
ance deviations for the affected position 
should be a sample from a different popula- 
tion than other performance deviations for 
the given position. This sample difference was 
analyzed by the Mann-Whitney U test. The 
results showed four interactions significant 
with p < .05. One of these, the effect of the 
conversion plotter on the track correlator, was 
thought due to the method of measuring per- 
formance effectiveness. The other three inter- 
actions were: A new polar plotter reduced 
performance of the conversion teller, a new 
conversion plotter reduced the performance of 
the scope teller, and a new scope teller re- 
duced performance of the scope reader. These 
interactions have the following common properties: (a) the degraded position is closer to the input of the system; (b) there is a telephone link between the affected positions; (c) the degraded position cannot readily store information and distribute the load over time, so that he becomes overloaded by having greater demands on his memory storage.
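The comparison of performance deviations can be illustrated with a direct computation of the Mann-Whitney U statistic. The deviation values below are invented for the example and are not data from the study; the function name is likewise hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs in which the value from sample_a exceeds the
    value from sample_b; tied pairs count one half.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical deviations (observed minus expected performance) for one
# position: sessions in which a neighboring position had just been
# turned over, versus all other sessions.
after_turnover = [-12.0, -9.5, -7.0, -4.0]
other_sessions = [-3.0, -1.0, 0.5, 2.0, 4.5]

u_after, u_other = mann_whitney_u(after_turnover, other_sessions)
u_statistic = min(u_after, u_other)  # compared with tabled critical values
```

A small U means the two sets of deviations barely overlap, which is the pattern expected if turnover in one position degrades a neighboring position.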

The third question posed for this study was “to what extent can the expected degradation in performance be overcome by various training methods?” This question remains unanswered. None of the methods tried in this study resulted in adequate performance improvement with practice by crews subjected to turnover resulting in gross skill depletion. The very best that was achieved was in the case of Crew D which for 10 out of 13 sessions performed at levels equal to or better than its performance prior to turnover. The training of this crew included individual training to a criterion, and knowledge of results consisting of a system summary, position-oriented feedback and problem-focused written critiques.


Finally, 
some addit 
late that i 




tion, crew cohesion, crew standards of acceptable performance, and others derived from sociopsychological research may account for the behavior of organizations subjected to deliberate membership manipulation.


SUMMARY 


The data indicated that, under certain conditions, crew turnover degraded the ability of a complex information-processing system to accomplish its mission. Whenever turnover resulted in little "skill dilution" the performance of the system was not greatly affected. Whenever the turnover resulted in considerable “skill dilution” the system either failed to improve or declined in performance over the period of training sessions. In a chain-type organization the effect of turnover was found to extend beyond the position replaced. A degradation was found in the performance of the position passing information to the replacement whenever the passing position had inadequate information storage capabilities. None of the training methods tried in the study was very effective in counteracting the effects of turnover.


REFERENCES

Borgatta, E. F., & Bales, R. F. Task and accumulation of experience as factors in the interaction of small groups. Sociometry, 1953, 16, 239-252.
Duncan, D. C. A new method of recording labor losses. Manager, 1955, 23, 30-35.
Fisher, B. Determining the cost of labor turnover. US Bur. Labor Statist. Bull., 1917, No. 22^, 60-66.
Forgays, D. G., & Levy, B. I. Combat performance characteristics associated with changes in the membership of medium-bomber crews. USAF Personn. Train. Res. Cent. res. Rep., 1957, No. 57-14.
Goodwin, W. R. The System Development Corporation and system training. Amer. Psychologist, 1957, 12, 524-528.
NGAN, M. The cost of labor turnover: A review of the literature. Bull. industr. Psychol. personnel Pract., Melbourne, 1948, 4(1), 12-27.
March, J. G., & Simon, H. A. Organizations. New York: Wiley, 1958.
Mills, T. Group structure and the newcomer: An experimental study of group expansion. Oslo: Oslo Univer. Press, 1957.
Rice, A. K. Productivity and social organization: The Ahmedabad experiment. London: Tavistock, 1958.
Simmel, G. The significance of numbers for social life. In P. Hare, E. F. Borgatta, & R. F. Bales (Eds.), Small groups. New York: Knopf, 1955. Pp. 9-15.


(Received May 6, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. Tur 


THE REACTION OF INTERVIEWERS TO FAVORABLE 
AND UNFAVORABLE INFORMATION 


B. I. BOLSTER AND B. M. SPRINGBETT


University of Manitoba 


pon findings (Bloom & Brundage, 
hr i Crissy & Regan, 1951; Newman, Bob- 
stuc & Cameron, 1946; Springbett, 1958) in 
studies of the employment interview have suggested that interviewers react more strongly to unfavorable information about the applicant than they do to favorable information. In addition, evidence of primacy effects has been produced (Springbett, 1958), i.e., the evaluation attached to an item of information presented first carries more weight than if it is presented later.

This experiment is designed to achieve three ends: (a) a direct, systematic and “purer” test of what has been incidentally discovered and reported concerning reaction to favorable and unfavorable information; (b) to identify some of the variables governing primacy effects; and (c) to make a rough check on the assumption of a “negative set” applied to explain previous results (Springbett, 1958).

riables associate 


th ; f the applicant. 
e physical presence O the importance 


her th N 4 
ere is evidence O ee 
tio, Mequacy of written verbal E 
be in the interview (Geldi, 19515 "P 
4, 1958). In addition some 
ba t or [besar o favorable and 
dip erable information is r 
patia] interviewer rating 
ey Duted to favorability ©" 
e. 
Pn es opla vi 
Shown. In an attempt to y 
acy effects based On jud ms es = 
ap ance were less than those base m 
Plication form it was suggested (SP 


bett, 1958) that in committing himself to a 
highly favorable rating on appearance the 
interviewer felt he was committing himself to 
a risk and consequently became more sensi- 
tive to negative information. One might argue 
on this basis that, more generally, deviation 
from the noncommittal attitude represents 
"risk" and that the further the interviewer 
deviates from this base, in either a favorable 
or unfavorable direction, the stronger be- 
comes the tendency to regress to neutrality. 
This assumption may be tested by inducing 
interviewers to build up various degrees of 
acceptance or rejection and then introducing 
information of a contrary nature. 

3. If, indeed, there is a “set” to find and 
favor negative evidence in the interview, in- 
dividual differences in the strength of the set 
may be expected. If such exist, those with a 
high degree of the set will tend to place high 
ratings on unfavorable information and rela- 
tively low ratings on the favorable. Those 


with a lesser degree of the set would place


lower ratings on the unfavorable and rela- 
tively higher ones on the favorable. If judges 
were ranked from high to low on "negative 
set," their ratings on an item of unfavorable 
information would be distributed from high 
to low, while their ratings on favorable in- 
formation would run from low to high, i.e., 
there would be a negative correlation between 
ratings on favorable and unfavorable infor- 


mation. 
METHODS AND PROCEDURES 


The setting in which the data were obtained was 
that of assessing the suitability of applicants for uni- 
versity contingents of the Canadian Army Officer 
Training Corps. This offered a number of advan- 
tages: first, there was a large number of highly ex- 
perienced personnel officers available (see Webster 
1959, p. 16) operating in relation to a common ster- 
eotype as to what constitutes a good officer (see 
Sydiaha, 1959) and, in addition, extensive files of 
interview reports provided a realistic source of per- 
sonal data for the construction of experimental 


materials. 




As indicated above, information was to be com- 
municated by the printed word in the experimental 
situation. The information had to be realistically 
related to the interviewers’ task and scaled with re- 
spect to its importance in decision making. Conse- 
quently, the details of method and procedure fall 
into two parts, that concerned with the construction 
of written protocols and the rating form, and that 
concerned with the experiment proper. 


Construction of Protocols and Rating Form 


Selection and scaling of items. Over 200 officer 
cadet selection reports were combed to secure 150 
items of relevant information evenly divided between 
favorable and unfavorable. These were edited for 
clarity, ambiguity, and to the end that each presented 
a single item of information.

These were arranged in two lists, favorable and 
unfavorable, and a preliminary ordering as to ap- 
parent intensity (importance) followed. Some item 
construction and reconstruction was necessary (a) to 
fill in gaps in apparent intensity, (b) to have for 
each favorable item a counterpart of the same intensity in the unfavorable list, (c) to provide a sample of items from each major area explored in the systematic officer cadet selection interview. A total of 100 items remained.


two piles— 
gory, “can’t 
were later 
orable and unfavor- 
alue groups separated 
vals of importance in mak- 


orable. A third cate 
and items so classified 
; each pile (fay 
nto seven v. 


only those items on which 
vas perfect, or nearly so, wi 


ere 
ed the number of items 


in each 
category to 30, 

The next operation was conducted by mail. It involved 25 full-time army personnel officers; the five intermediate judges were excluded.

Each group of 30 items was put into a modified pair-comparison format. In this format each item is compared with every other item in the immediately adjacent group (see Ferguson 1952, p. 309). From these results scale values were calculated using Thurstone's Law of Comparative Judgment, Case V assumption. (For these procedures see Guilford, Ch. 7; the statistical procedures are on p. 170.) The obtained scale values were transformed to a scale
with a mean of 10 and a standard deviation Y 

Construction of protocols. Using these 60 items and item values, 12 protocols of interview information were prepared, each containing 10 items.

Two of them (A and B) each contained 10 favorable items graduated from low to high scale value. Two (C and D) were similar except that they contained unfavorable items. The total scale values of each of these protocols were equal, within one unit of 166.

The remaining eight protocols (M, N, P, Q, R, S, X, Y) are called divided protocols. In each protocol the first five items are of one category (favorable or unfavorable); the last five are in the opposite category. Each group of five items may be of either high or low scale value. Total scale values for each high value group are equal, and the same is true of the low value groups. Table 1 shows the composition of each protocol.

Construction of the rating form. In View rimat 
speculations in the introduction concerning P' orm?” 
and set, it was desired to have the protocol a 4 
tion presented against a background of an pem 
Three "sets" were employed: (a) a set of E; aina 
ance”—this is achieved by giving the par an 
information that the hypothetical candidate a cet 
M score of 170 (175 is the approximate peat al 
score), (b) a set of “rejection” produced by val and 
ing an M score of 155 (a score of 160 is minim. s 
is waived only in the face of exceptional epu as 
ing qualities), (c) a “neutral” set produced e in 
signing an M score of 160. Interviewers W m 
Structed to start from the departure points of a 
reject, or neutral as these scores were assigned: spt- 


n 


Cutting horizontally across these are 10 lines 
for each item in a protocol); these lines are T ight 
off in intervals. The reject and accept points Ji the 
intervals to the left and right, respectively» al” rm 
neutral point. Extending beyond the "minim": ter- 
ject and accept points are an additional eight. 
vals to indicate degree of rejection or accept" 


TABLE 1 


COMPOSITION OF TWELVE PROTOCOLS OF INTERVIEW INFORMATION


Protocols si 
Ttems A B C D N M x Y E Q F 
lto 5 $ + — - cn $ = ++ + ++ = A 
6 to 10 ++ ++ -— -= + — +4 = = =- 4: ut 
Note. + and - indicate favorable and unfavorable information. Single and double signs indicate low and high scale values.




The Experiment Proper 


The subjects (interviewers) were given a rating 
form and an M score fixing their first point of de- 
Parture. Each had a protocol enclosed in a large 
Open-end envelope. The subjects then exposed the 
first item of the protocol and checked the rating 
scale, on the line for that item, to show the shift, if 
any, in their evaluation of the hypothetical candidate 
induced by that item. This procedure continued for 
all 10 items of the protocol.

Sixteen raters were randomly divided into four 
groups of four. The protocols were rated in three 
Sessions of four protocols each. In each session each 
Of the four subgroups received the protocols in dif- 
ferent orders so that possible series effects were 
counterbalanced by way of a four-by-four Latin
square.
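A four-by-four Latin square of presentation orders can be generated as in the sketch below. The article does not say which square was used or which protocols were grouped in which session, so the cyclic construction, the function name, and the protocol labels here are assumptions for illustration.

```python
def cyclic_latin_square(items):
    """Build an n-by-n Latin square by cyclically shifting the item list.

    Row g gives the presentation order for subgroup g; every item
    appears once in each row and once in each column (serial position).
    """
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Hypothetical labels for the four protocols rated in one session.
session_protocols = ["A", "B", "C", "D"]
orders = cyclic_latin_square(session_protocols)
for subgroup, order in enumerate(orders, start=1):
    print(f"Subgroup {subgroup}: {' '.join(order)}")
```

Each protocol then occupies each serial position exactly once across the four subgroups, which is what allows series effects to be counterbalanced.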
The 16 subjects were army (militia) personnel officers. All had previous training in the principles of psychology and their application to selection. All had previous experience in the selection of army officer cadets. Eight had two or more university degrees, four held one degree, and the remaining four had matriculation plus some higher formal education.
Be had taken part in, ny information 
about the preliminary phases of tù (T. 
ae may be noted that a preliminary pilot hc ls 
ë Procedures was carried out with 30 n s 
nsure smoothness of operation, the suitability E 
materials, and mode of presentation, and that à e 
nificant difference in ratings Was produced by 
Protocols, 


RESULTS 


The care taken in the construction of the protocols arose out of the concern that they be realistic and that the weightings assigned to the items would be valid in the sense that they would elicit proportionate shifts of rating in the experimental situation. This was checked by correlating the weights of the items with the amount of shift they induced.
As will be seen below there are om » 
"Imacy effects in that the first ids Ham 
i Totocols A, B, C, and D and the aum in 
ta the remaining protoco’s produce uU 
tings far greater than § 
able Sixth item is where 
5 to unfavorable inform 
Curs, 


tion, 0 


The ¢ i etween the 75 
ese pad e rating shifts 1S 
< 05). 

The correlation betwee? 
'Bhts of Items 2 to 
dne shifts is .820 (P < Pi rating shift for 
he ratio of item wei 


Ww 
r 


the “primacy” items is 2:1, i.e., two units of 
item weight elicit one unit of rating shift. The 
corresponding ratio for the remaining items 


is 6:1. 
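The two ratios can be turned into a small worked example. The ratios are the averages reported in the text; the function name and the item weight of 9.0 units are hypothetical, and the result is only the shift implied by the average ratio, not a prediction for any particular item.

```python
# Ratios of item weight to rating shift reported in the text.
PRIMACY_RATIO = 2.0     # two units of weight per unit of shift
REMAINING_RATIO = 6.0   # six units of weight per unit of shift


def expected_shift(item_weight, is_primacy_item):
    """Approximate rating shift implied by the reported average ratios."""
    ratio = PRIMACY_RATIO if is_primacy_item else REMAINING_RATIO
    return item_weight / ratio


# Hypothetical item of weight 9.0 scale units.
print(expected_shift(9.0, True))   # 4.5
print(expected_shift(9.0, False))  # 1.5
```

On these averages the same item moves the rating three times as far in a primacy position as it does elsewhere in the protocol.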


Reaction to Favorable and Unfavorable
Information 


The summary of results is shown in Figures 1, 2, and 3. In Figure 1 the results of Protocol A (favorable) and C (unfavorable) are shown. Protocol A starts from a departure point of “reject” (M score of 155). It will be noted that it takes on the average 8.8 items to shift the ratings from reject to accept. Protocol C starts from a rating of accept (M score of 170). The shift from accept to reject requires, on the average, only 3.8 items. At the end of 10 items Protocol A has induced a total rating shift of 17.5 units, Protocol C, 22.1 units; t = 12.67 (p < .001).

In Figure 2 the results of Protocols B and D are shown. In this case Protocol B (favorable) starts from a rating of accept and D from reject, i.e., the information confirms the direction of the initial set. Here, eight units of scale were available to the rater to indicate an increasing degree of acceptance or rejection. By the end of nine items 13 raters using Protocol D reached the maximum point, while only 6 of the raters using Protocol B had done so. The total amount of shift induced by Protocol D is 6.9 units and 5.8 units for B; t = 7.10 (p < .01).

Fig. 1. Rating curves on two basic protocols (A and C) of selection information. (Mean of 16 raters. Units noted on each curve represent the total item weight of the designated items.) Axes: Ratings against Items. Legend: broken line, unfavorable information; solid line, favorable information. Start line: "Reject" (A) or "Accept" (C).

Fig. 2. Rating curves on two basic protocols (B and D) of selection information. (Mean of 16 raters. Units noted on each curve represent the total item weight of the designated items.) Axes: Ratings against Items. Legend: broken line, unfavorable information; solid line, favorable information. Start line: "Accept" (B) or "Reject" (D); upper limit: maximum possible rating.

Figure 3 summarizes the data derived from the eight divided protocols. In half the protocols the first five items were favorable, in the other half unfavorable. The point of departure for the raters was a neutral rating (M score of 160). At the end of the fifth item the positive protocols elicited an average rating shift of 9.5 units, the negative protocols 10.8 units; t = 5.13 (p < .001).

[Fig. 3. Rating curves for the divided protocols. Legend: unfavorable information, favorable information. Reference lines: "Neutral" and "Accept or Reject."]

The second halt (last five items) of e 
protocols were of contrary sign to the firs 
half. The shift in rating induced by the pos 
tive items was 12.3 units, by the negative a 
units; t = 3.33 (p < .01). Interpretation of these results is complicated by the fact that the level of acceptance or rejection at which the final five items were introduced varied by virtue of being dependent upon the level reached in the first five items (see below). With this factor controlled the differences would be greater than those obtained.


Primacy Effects 


Following the line of thought indicated in the brief introductory remarks, the first concern is to determine whether primacy relates not simply to the first item in a series but rather to the first item that challenges an existing set. Second, it was suggested that primacy effects so defined would increase as a function of the strength of the set it challenged, i.e., the higher the rating at the point of challenge the greater the primacy effect. Another factor may be the weight or importance of the item that challenges the existing set, i.e., the question is whether the primacy effects of the lightweight and heavyweight items, as shown in rating shifts, are disproportionate to their weight.

Primacy related to shift of direction in the evidence. The clearest demonstration that an item has disproportionate effects in shifting ratings when it challenges an existing set may be found in a comparison of Figures 1 and 2. In Figure 1 both the positive and negative protocols challenge a set based on M scores. In Figure 2 the protocols do not challenge but, on the contrary, confirm the sets based on M scores.


€ 

The weights of the first items for A and j. 

are 6.5 and 6.7; for B and D, 6.5 and", 
The shift in ratings due to the first item 

A and C are 3.7 and 8.8, respectively; 10 d 


and D the corresponding figures are 0.2 


On the average the shift induced by a shift of direction in the evidence is approximately 25 times as great as that when the direction of the existing set is confirmed.
Primacy as a function of the height of the ratings. In the divided protocols, Item 1 challenges a “neutral” set either in the direction of acceptance or rejection. At the end of Item 5 high ratings of acceptance or rejection have been induced. Item 6 then challenges the ratings established at Item 5, i.e., Item 1 has to produce its effects against a low rating, Item 6 against a high rating. Table 2 shows the ratio of rating shift to item weight for Items 1 and 6 in each of the divided protocols. The average ratio for Item
(pi fib for Tem 6 it is 637; t 2145 
< .001). 
ii lines of evidence in 
of ve items show consistently that the rate 
return to the “neutral” line is à function 0 
lind height of the rating which is challenged 
ine that this rate decelerates aS the neutra 
5,5 approached. 
ibis SD as a function of it 
Weigh 3 shows the ratio of rating 5%. cha 
item t for low value items and for hig " s 
5 S. It will be noted that the low an g 
es items are equally distributed between 
0 ms 1 and 6. The results show that the unit 
i tating shift per unit of item weight is sig- 
= 3 antly greater for high value 
37 (p < 01). 


volving all the 


em weight. 
hift to item 
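The index used in Tables 2 and 3, rating shift per unit of item weight, can be illustrated with a minimal sketch. The two weight and shift pairs are the first-item figures quoted in the text for Protocols A and C; the function name and data layout are illustrative assumptions, not part of the original analysis.

```python
# Ratio of rating shift to item weight, the index reported in Tables 2 and 3.
# Function name and dictionary layout are illustrative, not from the paper.

def shift_per_unit_weight(shift: float, weight: float) -> float:
    """Return the rating shift induced per unit of item weight."""
    if weight <= 0:
        raise ValueError("item weight must be positive")
    return shift / weight

# protocol: (weight of first item, rating shift due to first item), from the text
first_items = {"A": (6.5, 3.7), "C": (6.7, 8.8)}

ratios = {
    protocol: shift_per_unit_weight(shift, weight)
    for protocol, (weight, shift) in first_items.items()
}

for protocol, ratio in ratios.items():
    print(f"Protocol {protocol}: {ratio:.2f} rating units per unit of weight")
```

Expressing each shift relative to the weight of the item that produced it is what allows items of different importance to be compared on one scale.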


TABLE 2
RATING SHIFT INDUCED [...] IN THE DIVIDED PROTOCOLS

[Table body illegible. Recoverable content: ratio of rating shift to
item weight, by protocol, for Items 1 and 6; mean ratio for Item 6,
.637.]


TABLE 3
EFFECT OF RELATIVE WEIGHT OF ITEMS ON RATING SHIFT: BASED ON
ITEMS 1 AND 6 OF DIVIDED PROTOCOLS

                      Rating Shift : Item Weight
Protocol    Item      (Low Value Items (a) or High Value Items (b))
M           1         .36
            6         [illegible]
N           1         .51
            6         [illegible]
P           1         .28
            6         .66
Q           1         .50
            6         .92
R           1         .34
            6         .30
S           1         .48
            6         .69
X           1         .37
            6         .58
Y           1         .50
            6         .66

                      Low Value Items    High Value Items
Means                 .447               .606
Standard deviations   [illegible]        .140

Note. t = 3.37.
(a) Low value items range from 6.5 to 9.1 units.
(b) High value items range from 10.9 to 13.4 units.


Correlation of Ratings Based on Favorable 
and Unfavorable Information 


As indicated in the introduction, individual differences in
negative set should result in a negative correlation between the
magnitudes of rating shift induced by favorable and unfavorable
information.

Table 4 shows three sets of correlations for each of the divided
protocols: (a) between Items 1 and 6, which are of opposite sign;
(b) between the cumulative rating at Item 5 and the rating at
Item 6 (again of opposite sign); (c) between the cumulative ratings
at the end of Item 5 and that at Item 10. (Each cumulative rating
represents the terminal point of a five-item series and within each
protocol the two series are of opposite sign.)

All but four of the correlations are positive


B. I. Bolster and B. M. Springbett 


TABLE 4
INTERCORRELATIONS (r) BETWEEN SHIFTS OF RATING BY SIXTEEN RATERS
AT CRITICAL POINTS OF EIGHT DIVIDED PROTOCOLS

            Description of
            Rating Shift Source
Protocol    Negative (a)  Positive    Item 1 vs. 6   Item 5 vs. 6   Item 5 vs. 10
N           High (b)      Low          .612           .356          [illegible]
R           Low           Low         -.144          -.605          [illegible]
X           Low           High         .285           .420          [illegible]
S           High          High         .833           .082          [illegible]
Mean                                   .396           .063           .424

            Positive      Negative
M           Low           High         .299           .600          [illegible]
P           Low           Low          .544           .334          [illegible]
Q           High          High         .145           .245          [illegible]
Y           High          Low         [illegible]    -.276          -.258
Mean                                   .400           .364           .342
Over-all Mean                          .428           .213           .383

(a) Negative-Positive indicates whether the information was
favorable or unfavorable.
(b) The terms High-Low refer to the weight of the information
correlated.


and the average correlation for each set is positive. As these
averages are based on eight


Discussion

Before commenting [...] it seems worthwhile [...] confirm Sydiaha's
(1959) suggestion of a stereotype, i.e., that there is a commonly
shared standard amongst army personnel officers defining the "good
[illegible]"; this is a necessary condition for the agreement found
in this study between item weights assigned in an abstract,
analytical situation and rating shift induced by the item weight
assessed against a background of other information.

The main results have some interest in that they confirm earlier
findings (Springbett, 1958) but indicate some modifications of
their interpretation.

First of all, there is clear-cut evidence that shifts in rating in
the direction of rejection are more easily induced than shifts in
the direction of acceptance: there is a differential sensitivity to
negative evidence. However, the logic of the earlier interpretation
predicting a negative correlation between ratings based on negative
and positive evidence finds no empirical confirmation. Rather, it
would seem that as the interviewer commits himself and deviates
further from a noncommittal position the more radically he reacts
to information which threatens the validity of his commitment. He
does react more readily to negative than to positive evidence, but
differences between interviewers seem more accurately described in
terms of a readiness to commit themselves, both negatively and
positively. However, those most ready to commit themselves are
quickest to regress to the noncommittal position in the face of
contrary evidence.

The results in relation to primacy effects confirm their presence
but suggest a modified definition of the term. Those items inducing
a rating shift disproportionate to their importance did so only
when they were the first to challenge a rating to which the
interviewer was committed. As it operates in the interview
situation primacy refers to the first change of direction in the
evidence. The magnitude of these effects then becomes a function of
the degree of commitment (height of rating). It is also a function
of the weight of the challenging item. It is somewhat of a surprise
to find that "heavyweight items" induce greater rating shifts per
unit of weight than do "lightweight items."

In terms of practical considerations it seems reasonable to suppose
that in those interviews where on-the-spot decisions are made the
factors relating to primacy effects would operate as they have done
in this experimental situation. It is not at all certain they would
so operate when decisions are deferred and all of the information
is reviewed before making a decision. However, to the extent that
these findings generalize to the interview in real life, they point
to a danger area, i.e., that an item of information, or the
uncovering of some characteristic, toward the end of the interview,
which runs counter to the general trend of evaluation is apt to
exert undue influence, undue in the sense that it will carry more
weight than if it had been encountered earlier.


SUMMARY 


This paper reports an investigation of the effects of favorable and
unfavorable information, and an analysis of primacy effects in an
experimental situation analogous to the employment interview.
Earlier findings of interviewer sensitivity to negative evidence
are confirmed but the earlier interpretation is modified. Primacy
effects are shown to be related to the first shift in direction of
evidence; magnitude of effects is shown to be related to the degree
of interviewer commitment at the point of shift and the weight of
the challenging information.


REFERENCES 


Broom, B. F., & BRuxDacr, R. G. Prediction of success in elementary
schools for enlisted personnel. In P. M. Stuit (Ed.), Personnel
research and test development. Princeton: Princeton Univer. Press,
1947.

Crissy, W. J., & Regan, J. Halo in the employment interview.
J. appl. Psychol., 1951, 35, 338-341.

Ferguson, L. W. Personality measurement. New York: McGraw-Hill,
1952.

Giedt, F. H. Judgment of personality characteristics from brief
interviews. Unpublished PhD thesis, University of California, 1951.

Guilford, J. P. Psychometric methods. New York: McGraw-Hill, 1954.

Newman, J. H., Bobbitt, J. M., & Cameron, D. C. The reliability of
the interview method in officer candidate evaluation program.
Amer. Psychologist, 1946, 1, 103-109.

Springbett, B. M. Factors affecting the final decision in the
employment interview. Canad. J. Psychol., 1958, 12, 13-22.

Sydiaha, D. The relation between actuarial and descriptive methods
in personnel appraisal. J. appl. Psychol., 1959, 43, 395-401.

Webster, E. C. Decision making in the employment interview.
Personnel Admin., 1959, 22, 15-22.


(Received May 12, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 2, 104-110


OBJECTIVE PERSONALITY TEST AND SOCIOMETRIC 
CORRELATES OF FREQUENCY OF SICK BAY VISITS 


ROBERT R. KNAPP 2
Naval Medical Field Research Laboratory, Camp Lejeune, North Carolina

1 The present data were collected as part of Bureau of Medicine and
Surgery Project Number NM 18 01 09.1. The opinions expressed are
those of the author and are not to be construed as being official
or in any way representative of the United States Navy.

2 Now at United States Naval Personnel Research Field Activity,
San Diego, California.

The author wishes to gratefully acknowledge John A. Most, Medical
Corps, USN, for his invaluable assistance in many phases of the
present investigation.

The present investigation is concerned with the relationships
between a number of objective tests (as defined by Cattell, 1957b,
pp. 225, 897; Scheier, 1958) of personality, sociometric ratings,
and an index of frequency of sick bay visits. Previous studies have
demonstrated that certain personality test scales are related to
the incidence of somatic illness (Staton & Rutledge, 1955) as well
as to the nature of the particular disability group (Wiener, 1956).
The correlations obtained with frequency of dispensary visits have
generally been small, and studies such as those above have been
limited to the self-report, or questionnaire, approach to
personality appraisal.

In a sample of 95, Staton and Rutledge (1955) obtained point
biserial correlations between a group having [...] frequency groups
were considered by sex, none of the rs reached significance in the
male sample while two of the rs were significant in the female
group. The authors suggested that perhaps [...] illness associa[...]
in several disability [...] MMPI scales, thus


suggesting an association between [illegible] measurable
personality traits (particularly Hypochondriasis, Depression, and
Hysteria) and physical disability. This association seemed to hold
true for disabilities [illegible] thought of as psychosomatic in
origin, as well as for disability groups, such as those with
gunshot wounds and flat feet, not ordinarily thought of as being
associated with personality characteristics. As the author notes,
the problem of the distinction between predisposition and reaction
reflected in the test scores of these groups was insoluble.

The problem area of the relationship between sociometric ratings
and frequency of dispensary visits has also been explored. French
(1951) found that frequent sick bay visits among naval recruits
bore a significant negative relationship to peer nominations based
on a friendship criterion (acceptability as a liberty companion)
but found no relationship with peer nominations of leadership.
Izard and Manhold (1954) and Izard (1959), however, reported a
significant negative relationship between frequency of sick bay
visits and peer nominations of leadership in a group of naval
aviation cadets. They also found that those Ss judged to be in a
psychosomatic classification had a lower mean sociometric
leadership score. Wellingham (1959) found a similar relationship
between dispensary visits and leadership ratings among aviation
cadets. He also obtained significant negative relationships between
sick call frequency and [illegible] of seven academic courses
included in the naval aviation preflight training program, as well
as with measures of physical training, instructors' ratings of
officer-like qualities, and final grades in flight training.

The present article examines the relationship between sick call
frequency and sociometric ratings of pilot proficiency, officer-like
qualities, and social acceptability in an operational air group,
and also examines the relationship of sick call frequency to
objective personality tests which do not rely primarily on the
self-report technique. It was felt that a personality test of the
conventional type, such as the MMPI, would not yield unequivocal
results in this context, because such questionnaires use questions
relating directly to health in order to infer standing on
personality traits.


Ordinarily it is probably quite justifiable to use responses to
health questions to infer personality characteristics, such as is
done in the case of the Hypochondriasis scale of the MMPI. For
example, suppose 10% of all individuals in the general population
respond positively to the question "I have a great deal of stomach
trouble" (Hathaway & McKinley, 1951), and only 1% of such persons
have genuine organically determined stomach disorders. Then one
will identify a hypochondriac 9 times out of 10 on the basis of a
positive response to this question. (A hypochondriac, for the
purpose of this illustration, is a person complaining of stomach
disorders who has no organic stomach trouble.) If one were to test
only those persons who consult a doctor specializing in disorders
of the stomach, however, it is conceivable that all of them might
answer the same question positively. It is also conceivable that
the vast majority of these persons suffer from organic stomach
trouble. For this select population, then, a positive response to
this question would be a less appropriate indicator of
hypochondriasis as defined for our illustration.
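The arithmetic behind the stomach-trouble illustration can be sketched as follows. The sketch assumes one reading of the passage: that the 1% with organically determined disorders is 1% of the general population, and that all of them answer the question positively. Under that assumption 9 of every 10 positive responders have no organic disorder. The specialist-clinic figures are purely hypothetical, chosen to show how the same response becomes a poor indicator in a select population.

```python
# Share of positive responders who have no organic stomach disorder.
# Assumption: everyone with an organic disorder answers positively.

def share_without_organic_cause(p_positive: float, p_organic: float) -> float:
    """P(no organic disorder | positive response) under the stated assumption."""
    if not 0 < p_positive <= 1:
        raise ValueError("p_positive must lie in (0, 1]")
    if not 0 <= p_organic <= p_positive:
        raise ValueError("p_organic must lie in [0, p_positive]")
    return (p_positive - p_organic) / p_positive

# General population: 10% respond positively, 1% have an organic disorder.
general_population = share_without_organic_cause(0.10, 0.01)

# Hypothetical specialist clinic: all respond positively, 90% are organic cases.
specialist_clinic = share_without_organic_cause(1.00, 0.90)

print(f"general population: {general_population:.2f}")
print(f"specialist clinic:  {specialist_clinic:.2f}")
```

The response is the same in both settings; only the base rate of organic illness among those answering changes, which is the point of the illustration.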

It might be similarly inappropriate to ask questions about health
of persons who frequent sick bay, and infer personality traits from
their answers. The response "Much of the time my head seems to hurt
all over," particularly when made by a person who is often on sick
call, will frequently, among other things, mean he has a
psychosomatic disorder, or he is suffering from some other
[illegible] disorder, and perhaps less frequently that he is a
hypochondriac.

In view of these considerations it was felt that any significant
correlations found between personality or temperamental traits
measured in objective tests and frequency of dispensary visits
would be especially meaningful. The correlations would seem, to
some extent, to obviate the problem arising in self-report data of
having to determine whether a reported concern over bodily function
and disabilities arose from the disability or whether certain
personality traits are to be found associated with a tendency to
frequently seek medical advice.


PROCEDURE 
Subjects 


The Ss were 81 Marine Corps officer helicopter 
pilots who were selected for the present project on 
the basis of availability during the testing periods, 
These 81 Ss represent approximately one-third of the 
helicopter pilots then assigned to Marine Air Group 
26 and are considered to be representative of the 
larger pilot population. Since no single squadron 
comprising this air group was large enough to pro- 
vide the total sample, it was necessary to obtain Ss 
from five separate squadrons. All Ss were qualified 
pilots having satisfactorily passed all selection and 
training requirements, and were assigned to helicop- 
ters in the present operational group. They ranged in 
age from 21 to 38 years with a mean of 25 years. 
Educational level attained ranged from 11 to 18 
years of school completed, with a mean of approximately
13 years. Seventeen percent had gone only as
far as high school while 55% had completed four or
more years of college. Ss ranged in rank from 2nd
Lieutenant to Major, but the majority (approximately
77%) were 1st or 2nd Lieutenants. Data
from health records in which each visit to the in- 
firmary, each diagnosis, and each treatment is indi- 
cated, were available for only 55 of the 81 Ss, so 
that all correlations with sick call frequency are 
based on the smaller N of 55. 


Tests 


The objective personality measures used were 
those comprising Cattell's (1957a) Objective-Analytic 
Personality Test Battery (O-A Battery). The bat- 
tery, as administered for the present investigation, 
consists of some 68 separate tests and test scores 
which are combined in such a way as to yield 12 
individual factor scores. For definitions of specific 
factors in the O-A Battery the reader is referred to 
Cattell (1957b). Tests were administered over a 
period of a day-and-a-half along with some standard 


personality inventories. 


Index of Sick Call Frequency 


For the sample of 55, all visits to sick bay were 
recorded and a total count was obtained for each S. 
Since the pilots in the present investigation had 
varying lengths of military experience, the total 
count was then divided by the number of months 
between the first entry in the record form and the 
date of the present survey. This yielded an average
number of visits per month for each S based on a
period of at least 7 months but not more than 34
years.

3 The validities of these personality inventories are
considered elsewhere (Knapp & Most, 1960).
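The index described above, total sick bay visits divided by the number of months covered by the health record, can be sketched as follows. The text does not say how months were counted, so counting whole calendar months is an assumption; the function names and the example record are hypothetical.

```python
from datetime import date

# Sick call index: total visits divided by months on record.
# Counting whole calendar months is an assumption of this sketch.

def months_on_record(first_entry: date, survey: date) -> int:
    """Whole calendar months from the first record entry to the survey date."""
    months = (survey.year - first_entry.year) * 12 + (survey.month - first_entry.month)
    if months <= 0:
        raise ValueError("survey must fall at least one month after the first entry")
    return months

def visits_per_month(total_visits: int, first_entry: date, survey: date) -> float:
    """Average number of sick bay visits per month on record."""
    return total_visits / months_on_record(first_entry, survey)

# Hypothetical record: 8 visits between January 1958 and January 1959.
example_months = months_on_record(date(1958, 1, 15), date(1959, 1, 15))
example_rate = visits_per_month(8, date(1958, 1, 15), date(1959, 1, 15))
print(f"{example_months} months on record, {example_rate:.2f} visits per month")
```

Dividing by time on record keeps a pilot with long service from appearing to be a frequent visitor simply because his record covers more months.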


Nominations of Pilot Proficiency, Officer-Like 
Qualities, and Social Acceptability 


The sociometric measures used in the present in- 
vestigation were peer nomination forms which car- 
ried brief definitions of the three qualities under 
consideration. Ss were given a list of names of only 
those in the investigation who were from their own 
squadron, from among which each S was asked to 
pick the top 25% and the bottom 25% for each of 
the three variables being rated. Each time a given 
individual’s name appeared in the top quarter a 
score of +1 was given and in the bottom quarter, 
—1. Thus, every person received a score on each 
variable, the possible range of which was a positive 
to negative number corresponding to the number of 
Ss in the particular squadron. 

It was not possible to have everyone in the study 
select from a total list of all other individuals, since 
not all of the Ss would have had an opportunity to 
know all other Ss represented. Therefore, nominations
were made only from a list of names supplied
within each squadron. Once scores were obtained for
each S within his squadron, it was possible to select 
a constant by which all scores within a squadron 
could be multiplied to yield scores comparable from 
one squadron to another. Thus, an S nominated by 
all individuals in one squadron as being top in a 
trait would receive approximately the same score as 
another S picked by all individuals of another squad- 
ron, even though the number of Ss within the 
squadrons was different. The instructions and trait 
definitions for use in the peer nominations were 
given to the participating Ss on separate forms and 
are presented below. The appropriate N for each 


squadron appeared where bracketed percentages are 
shown below. 


Pilot proficiency. From this list of names of pilots in your
squadron, we would like you to pick out (a) the [25%] whom you
consider the most proficient pilots (considering all the factors
that are generally thought of as going to make up a "professional"
naval aviator), and (b) the [25%] whom you would rank as the least
proficient of this particular group. You need indicate no
preference or rank within these two groups of [25%]. Your answers
will be confidential, and you need not sign your name. Please note
that this ranking system applies only to the group of people named
here. You are not necessarily implying that those you picked as
"most proficient" are the best you've ever flown with, nor that
those whom you picked "least proficient" have any deficiencies at
all. You are merely indicating their comparative standing, in your
opinion, within this group.

Officer-like qualities. Next, rank the members of this group, in
the same way, as to the presence of those qualities which are
ordinarily thought of as "officer-like qualities." In other words,
pick the [25%] who are (a) considered by you as highest in military
proficiency and officer-like qualities, and (b) the [25%] who would
rank lowest, within this group, in the same qualities.

Social acceptability. Lastly, rank the members of this group, in a
similar manner, as to: (a) the [25%] who seem to "fit in" best with
the squadron as a whole, in other words those who contribute the
most, by reason of their personality and general disposition, to
harmony and good feeling; and (b) the [25%] whom you consider to
rank the lowest in this same category. It may be of help if you
would try to pick these men as though you were expressing
preference among them, as to those whom you would most like to have
as a companion on an extended cruise (purely from the standpoint of
personality and social acceptability) and those whom you would
least desire under the same circumstances.
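The scoring of the peer nominations described above can be sketched as follows. Each appearance in a rater's top quarter counts +1 and each appearance in the bottom quarter counts -1. The text says only that scores were multiplied by a squadron constant to make them comparable; taking that constant to be one divided by the number of raters is an assumption of this sketch, and the pilot labels and ballots are hypothetical.

```python
# Peer nomination scoring: +1 per top-quarter nomination, -1 per
# bottom-quarter nomination, then scaled within the squadron.
# Assumption: the scaling constant is 1 / (number of raters).

def raw_scores(members, ballots):
    """Sum +1 and -1 nominations for each squadron member."""
    scores = {name: 0 for name in members}
    for top, bottom in ballots:
        for name in top:
            scores[name] += 1
        for name in bottom:
            scores[name] -= 1
    return scores

def scaled_scores(members, ballots):
    """Scale raw scores so a unanimous top nomination equals 1.0 in any squadron."""
    constant = 1.0 / len(ballots)
    raw = raw_scores(members, ballots)
    return {name: raw[name] * constant for name in members}

# Hypothetical squadron of four; each rater names one top and one bottom man.
members = ["P1", "P2", "P3", "P4"]
ballots = [
    (["P1"], ["P4"]),
    (["P1"], ["P4"]),
    (["P1"], ["P3"]),
    (["P2"], ["P4"]),
]
scaled = scaled_scores(members, ballots)
print(scaled)
```

With this choice of constant, a man placed in the top quarter by every rater scores 1.0 whether his squadron supplied four raters or forty, which is the comparability the procedure aims at.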


RESULTS 


The range of mean number of sick call visits per S was zero to 2.60
visits per month. The mean number of visits per month for the 55 Ss
was .64 with the standard deviation being .57 visit per month.

The correlations between the 12 objective personality test factors
and sick call visits are shown in Table 1. Significant correlations
were obtained against three of the 12 O-A Battery factors. To more
clearly understand these associations, and also to present
information to others in the area of objective test development,
the sick call index was correlated with each individual test in the
three factors where over-all significance was obtained. These
correlations are presented in Table 2.

The complete matrix including personality variables against which
significant correlations with the sick call index were obtained is
included in Table 3, as are r's with other relevant test and life
behavior indices.

Through examination of Table 3, it can be noted that high positive
correlations were obtained between frequency of sick call visits
and objective test factors UI (Universal Index) 16 and UI 22.
Persons scoring high on UI 16 have been depicted (Cattell, 1957b,
p. 237) as quick, not easily upset, [illegible] restrained, more
critical, and more determined and effective in their actions. They
have been further characterized as being insuggestible and as
tending to emphasize personal and esthetic values. High UI 22
scores are held to be

TABLE 1
CORRELATIONS BETWEEN O-A PERSONALITY TEST BATTERY SCORES AND
FREQUENCY OF SICK CALL VISITS

Universal
Index                                              r with Sick
Number    Factor Title                             Call Visits
16        Harric Assertiveness                      .57***
17        Inhibition                               [illegible]
18        Hypomanic Smartness                      -.06
19        Critical Practicality                     .15
20        Comention vs. Abcultion                  -.01
21        Exuberance                                .25
22        Corticalertia                             .41**
23        Neural Reserves vs. "Neuroticism"        -.08
24        Anxiety                                  [illegible]
25        Realism vs. "Psychotic Tendency"          .21
26        Self-Sentiment Control                   -.27*
27        Apathy                                    .20

* Significant at the .05 level.
** Significant at the .01 level.
*** Significant at the .001 level.


indicative of "an alert, eager, controlled contact with external
events" (Cattell, 1957b, p. 251). High cognitive fluency, much
speed in simple mental processes and bold, uncritical reactions are
characteristic of persons high on this factor.

It will also be noted in Table 3 that correlations of pilot
proficiency (PP), officer-like


TABLE 2
CORRELATIONS BETWEEN SICK CALL FREQUENCY AND INDIVIDUAL TESTS
IN THREE O-A BATTERY FACTORS

          Master
          Index
Factor    No.     Test Title                                                   r
UI 16     282     Many objects perceived in unstructured pictures             [illegible]
          117     "Highbrow" taste in social and esthetic environment          .32
          278     Fast reading tempo                                           .28
          125     High emphasis personal vs. institutional values              .21
           98     Percentage decisions correct                                 .22
          307     Fast speed of letter comparison                              .21
          237     High ratio distance on difficult to easy mazes               .09
          309     Fast speed of line judgment                                  .03
UI 22     282     Many objects perceived in unstructured pictures              .31
            6     High ideomotor speed                                        [illegible]
          159     High accuracy of "time required" estimates                  [illegible]
            9     Fast ideomotor tempo                                        [illegible]
            8     High speed alternating perspective                          [illegible]
UI 26      36     High ability to state assumptions                           [illegible]
           29     Little carping criticism and mischievous humor              -.30
          191     Low amount considered possible for others in given time      .30
          273     High proportion of fluency on self                          -.28
          283     Low proportion of fluency on dreams                         [illegible]
          206     Low ratio initial to final performance on CMS test          -.00
          105     Tendency to perceive many threatening objects in
                  unstructured drawing                                         .04




TABLE 3
INTERCORRELATIONS AMONG THE COMPONENTS OF A HYPOTHETICAL CLUSTER
OF VARIABLES

Variable    N    UI 16        UI 22        UI 26    PP       OLQ          SA           Educ   Sick Call
UI 16       81
UI 22       81   [illegible]
UI 26       81   [illegible]  -.04
PP          81   -.39***      -.12          .16
OLQ         81   -.04         [illegible]   .16      .40***
SA          81   -.17          .05          .22*     .56***  [illegible]
Educ        75   -.02         -.06          .16     -.11      .16         [illegible]
Sick Call   55    .57***       .41**       -.27*    -.39**   -.28*        -.36**       -.07
AQT         48    .37*        -.07          .02     -.16      .07         -.21          .12   [illegible]

* Significant at the .05 level.
** Significant at the .01 level.
*** Significant at the .001 level.


qualities (OLQ), and social acceptability 
(SA) with sick call visits were significantly 
negative. 


DISCUSSION

The mean number of monthly dispensary visits per S in the present
study was .64. Izard (1959) reports a mean number of dispensary
visits of 10.84 for their psychosomatic cadet group and of 6.53 for
their nonpsychosomatic group over an 8-month period, which would
represent a mean of approximately 1.35 and .82 visits per month for
the two groups, respectively. Since these latter figures are from a
cadet population and since frequency of dispensary visits has been
shown to be negatively related to over-all performance in flight
training (Wellingham, 1959), the lower obtained mean number of
dispensary visits for the present group of successful trainees was
expected.

It was found that frequency of sick bay visits was significantly
correlated with three of the O-A Battery factors. The association
was particularly high with UI 16 (see Table 1), the obtained r
being .57. From the examination of the kinds of tests represented
in some of the O-A Battery factors (especially UI 16), it might be
hypothesized that intelligence, or an intelligence related element
of UI 16, is contributing to a portion of the variance accounted
for by these factors. Cattell, Knapp, and Scheier (in press) have
presented data from the second-order factorization of five studies
indicating that first-order Factors UI 16 and UI 1 (Intelligence)
both load, in a positive direction, a second-order factor which
they have termed "Expansive Ego-vs.-History of Difficulty in
Problem Solving." Thus, were a portion of the UI 16 variable
actually measuring intelligence it might be hypothesized that it is
this contributor which is accounting for the predicted variance in
the obtained correlation coefficients. In order to further
investigate possible contributing factors, two additional
variables, intelligence as measured by the Aviation Qualification
Test (AQT) (Ambler, 1955) and educational level, were introduced
into the matrix. The obtained r's are presented in Table 3.
Although the correlation between intelligence and UI 16 was
significant, only a small part of the UI 16 variance would appear
to be accounted for by the intelligence factor. Furthermore, the
nonsignificant intelligence-sick call r shows that virtually none
of the sick call variance is accounted for by the intelligence
factor as measured by the AQT. Thus, it appears that intelligence
is not related to frequency of sick calls in the present sample.

Examination of Table 3 will show that several relationships which
have been obtained in other studies were further supported by the
present data. First, the sociometric indices seem to be decidedly
lower for those who visit sick bay frequently. Secondly, education
and intelligence seem to be unrelated to frequency of sick call
visits in this selected group. It was also demonstrated that
objectively measurable personality characteristics appear to be
significantly related to sick call frequency in the present male
population. UI 16 had the highest correlation with the sick call
criterion, its validity accounting for an estimated 32% of the
variance. The negative correlation between UI 16 and pilot
proficiency ratings was also significant. Since the present
personality tests do not infer personality traits from
self-reported concern over health, it is suggested that the
obtained correlations indicate a true relationship between the
personality traits measured and sick call frequency, rather than
being artifacts of the personality testing technique used.
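The 32% figure follows from squaring the correlation of .57 between UI 16 and the sick call index. The sketch below shows that step and, as an added illustration, the standard t statistic for testing a Pearson r against zero with N = 55. The article does not state which significance test was used, so the t computation is an assumption; the function names are illustrative.

```python
import math

# Variance accounted for by a correlation, and the t statistic for r.
# The t test is a standard procedure assumed here, not quoted from the article.

def variance_explained(r: float) -> float:
    """Proportion of criterion variance accounted for by a correlation r."""
    return r * r

def t_for_r(r: float, n: int) -> float:
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    if n <= 2 or abs(r) >= 1:
        raise ValueError("need n > 2 and |r| < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

n_sick_call = 55   # Ss with health records
r_ui16 = 0.57      # UI 16 with sick call visits
r_ui26 = -0.27     # UI 26 with sick call visits

print(f"UI 16: {variance_explained(r_ui16):.1%} of variance, "
      f"t = {t_for_r(r_ui16, n_sick_call):.2f}")
print(f"UI 26: {variance_explained(r_ui26):.1%} of variance, "
      f"t = {t_for_r(r_ui26, n_sick_call):.2f}")
```

By the same calculation the UI 26 correlation of -.27, although significant, accounts for only about 7% of the sick call variance.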

It was also found that those persons frequenting sick call were, on
the average, rated lower sociometrically. It is impossible from the
present data to assess whether (a) those visiting sick bay often,
for whatever reason, are rated lower because of this; or (b)
whether those misfitted to the group tend to frequent sick bay; or
(c) whether this relationship is the result of some third factor
such as UI 16. In any event, those high on O-A Battery Factors
UI 16 and UI 22 and, to a lesser extent, those low on UI 26 tend to
frequent sick bay and tend to be rated less proficient as pilots.

Using the factor definitions presented by Cattell (1957b), the
significant personality relationships suggest that those visiting
sick bay more frequently are (from UI 16) characterized by fast,
determined, effective action; are comparatively insuggestible; and
tend to emphasize personal and esthetic, or cultured, values. They
tend (from UI 22) to be high in cognitive fluency and may be
characterized by much speed in simple mental processes and by bold
and uncritical reactions. They tend (from UI 26) to lack a desire
for self-control and carefulness.

From an examination of the test and factor correlates of sick call
frequency, one might speculate that the individual frequenting sick
bay is [illegible] ego involved with [...] a great deal of time
[...] a high level of importance to his own worth, values, and
health. His concern for following the rational, [illegible],
"intellectual" approach leads him to sick call with symptoms which
other, less self-involved persons may tend to gloss over.


SUMMARY AND CONCLUSIONS 


A battery of objective personality tests, the 
Objective-Analytic Personality Test Battery, 
was administered to 81 Marine Corps officer 
helicopter pilots, and the 12 resulting factor 
scores were correlated with an index of fre- 
quency of sick call visits. Sick call visits were 
also correlated with sociometric rankings of 
pilot proficiency, officer-like qualities, and
social acceptability in their squadron. Three 
of the 12 correlations between personality 
factors and sick call frequency were signifi- 
cant at the .05 level or better. All correlations 
between sociometric rankings and frequency 
of dispensary visits were significant and nega- 
tive in direction. 

The obtained relationships suggest the fol- 
lowing conclusions: 


1. Certain objectively measurable personal- 
ity traits are related to frequency of sick bay 
visits in the present population. Significant 
positive correlations were obtained between 
frequency of sick call visits and personality 
Factors UI 16 and UI 22. A significant nega- 
tive correlation between UI 26 and sick calls 
was also obtained. By using the factor inter- 
pretations presented by Cattell (1957a, 
1957b) it was suggested that those frequent- 
ing sick bay would be characterized by fast, 
determined, effective action and by a tendency 
to emphasize personal and esthetic values. 
They are depicted as high in cognitive flu- 
ency, displaying high speed in mental proc- 
esses, and they tend to be bold and uncritical 
in their reactions, lacking a desire for self- 
control and carefulness. 

2. Sociometric ratings of pilot proficiency,
officer-like qualities, and social acceptability 
were significantly related to frequency of sick 
call visits, with those having the lower socio- 
metric ratings being the most frequent sick 
call visitors. 

3. Educational level and intelligence were 
found to be unrelated to frequency of sick 
call visits and to sociometric ratings. 


Since the present relationships were based 
on objective test data rather than questionnaire 
data, it was hypothesized that stable, 




personality, and temperamental traits may be 
the underlying factors related to the tendency 
to seek frequent medical consultation and to 
lower sociometric ratings. It is doubtful that 
either low social acceptability or frequent sick 
bay visits, for whatever reason, would affect 
scores on O-A Battery tests, as might be the 
case with questionnaires such as the MMPI. 


REFERENCES 


Ambler, Rosalie K. Characteristics of the revised 
aviation selection test battery administered experi- 
mentally to naval aviation cadets. USN Sch. Aviat. 
Med. res. Rep., 1955, Proj. No. NM 001 108 102.1. 

Cattell, R. B. Handbook for the Objective-Analytic 
Personality Test Battery. Champaign, Ill.: Insti- 
tute Personality Ability Testing, 1957. (a) 

Cattell, R. B. Personality and motivation structure 
and measurement. New York: World Book, 1957. (b) 

Cattell, R. B., Knapp, R. R., & Scheier, I. H. 
Second-order personality factor structure in the 
objective test realm. J. consult. Psychol., in press. 

French, R. L. Sociometric status and individual ad- 
justment among naval recruits. J. abnorm. soc. 
Psychol., 1951, 46, 64-72. 

Hathaway, S. R., & McKinley, J. C. The Minnesota 
Multiphasic Personality Inventory manual. 
(Rev. ed.) New York: Psychological Corporation, 
1951. 

Izard, C. E. Personality correlates of sociometric 
status. J. appl. Psychol., 1959, 43, 89-93. 

Izard, C. E., & Maxnorp, J. H. Correlates a p 
leadership: I. Medical complaints. USN Sch. Aviat. 
Med. res. Rep., 1954, Proj. No. NM 001 

Knapp, R. R., & Most, J. A. Personality correlates 
of Marine Corps helicopter pilot DEED oi 
USN Med. Fld. Res. Lab. Rep., 1960, No. 1 
09.1.3. 

Scheier, I. H. What is an “objective” test? Psychol. 
Rep., 1958, 4, 147-157. 

Staton, W. M., & Rutledge, J. A. Measurable traits 
of personality and incidence of somatic illness 
among college students. Res. Quart. Amer. Ass. 
Hlth. Phys. Educ. Recr., 1955, 26, 197-204. 

Willingham, W. W. Non-medical correlates of med- 
ical complaints among aviation cadets. J. Aviat. 
Med., 1959, 30(1), 29-34. 

Wiener, D. N. Personality characteristics of selected 
disability groups. In G. S. Welsh & W. G. Dahl- 
strom (Eds.), Basic readings on the MMPI in 
psychology and medicine. Minneapolis: Univer. 
Minnesota Press, 1956. Pp. 435-451. 


(Received May 14, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 2, 111-116 


DYNAMIC VISUAL ACUITY AS RELATED TO AGE, 
SEX, AND STATIC ACUITY 


ALBERT BURG AND SLADE HULBERT 


Institute of Transportation and Traffic Engineering, University of California, Los Angeles 


“Dynamic Visual Acuity,” or “DVA,” as it 
is conveniently abbreviated, is the term used 
to designate the ability of an observer to dis- 
criminate an object when there is relative 
movement between the observer and the ob- 
ject. The term was originated by Elek Lud- 
vigh and James W. Miller, and their pioneer- 
ing study of the problem is reported in a 
series of research articles (Ludvigh, 1953; 
Ludvigh & Miller, 1953, 1954a, 1954b, 1955; 
Miller, 1956a, 1956b; Miller & Ludvigh, 
1953, 1955, 1956). Ludvigh and Miller found 
that the correlation between dynamic acuity 
and static acuity (both tested monocularly) 
was very low, and that training did not alter 
this correlation in any way. Additional find- 
ings were that (a) acuity for a moving target 
deteriorates markedly and progressively as 
the angular velocity of the target increases; 
(b) DVA performance can be improved both 
through practice and through increased target 
illumination; and (c) the above findings 
apply substantially whether the plane of tar- 
get movement is horizontal or vertical, or 
whether the target is moving with the subject 
stationary or vice versa. 

When the present authors began their in- 
vestigation of DVA, it was with several pur- 
poses in mind. First, an independent verifica- 
tion of Ludvigh and Miller’s results appeared 
desirable, at least insofar as the relationship 
between DVA and static acuity was con- 
cerned. In this connection it was deemed 
important to establish normative performance 
data for a more heterogeneous group of sub- 
jects (Ludvigh & Miller used naval aviation 
cadets) and also to alter their test procedure 
to permit somewhat easier generalization to 
normally encountered visual tasks. 

A second aim was a study of the relation- 
ship, if any, between DVA and other visual 
measures such as critical flicker frequency 
(CFF) and lateral phoria (as stated in terms 
of the ACA ratio¹), as well as nonvisual 
factors, namely, age and sex. Also to be 
studied was performance with the subject's 
head fixed (as in Ludvigh & Miller's re- 
search), as opposed to that with the head per- 
mitted to rotate freely. 


METHOD 
Apparatus 


An earlier report (Hulbert, Burg, Knoll, & Mathew- 
son, 1958) describes the experimental apparatus in 
considerable detail, so only a brief resumé will be 
given here. A slide projector mounted on a revolving 
turntable projects a target image on a 180° cylindri- 
cal beaded-plastic screen 4 feet in radius. The screen 
is uniformly illuminated at a level of 10 foot-candles, 
and has an 80% reflectance factor. The subject sits 
underneath the projector platform so that the pivot 
point of the turntable, the center of curvature of the 
screen, and the center of the subject’s head are all in 
vertical alignment. Figure 1 provides a view of the 
complete experimental arrangement. A variable-speed 
motor permits control of the angular velocity of the 
target across the screen, and the subject's chair has 
been modified to permit mounting a bite board. 

The target used is the familiar checkerboard pat- 
tern utilized in the acuity test of the Bausch and 
Lomb Ortho-Rater. Fifteen slides are used to dupli- 
cate the sequence of target sizes, and the projected 
images subtend approximately the same visual angles 
as the corresponding targets in the Ortho-Rater. 


Procedure 


The subject's monocular and binocular static 
acuity, near and far lateral phoria, and CFF were 
first determined. Static acuity was measured by 
means of the Ortho-Rater, while phoria and CFF 
were determined by standard techniques. 

The subject was then seated in the DVA apparatus 
and his binocular static acuity again measured, this 
time using the 15 projected targets. The targets were 
presented in sequence, from largest to smallest, and a 
shutter mechanism was used to limit exposure of 


¹ The ACA ratio is a measure of the amount of 
accommodative convergence for each unit of accom- 
modation, and does not appear to vary with age, 
experience, or training. It is calculated from near and 
far lateral phoria scores. A detailed discussion of its 
use is given by Davis and Jobe (1957). 




TABLE 1 
PRODUCT-MOMENT CORRELATIONS BETWEEN (BINOCULAR) STATIC AND DYNAMIC ACUITY 

DVA Speed       Ortho-Rater Static Acuity        Screen Static Acuity 
(°/second)       N    Correlation     p           N    Correlation     p 

Free-Head 
    20           51      .3061       .02          51      .5238       .001 
    60          230      .2798       .001        230      .6342       .001 
    90          173      .2353       .01         179      .2674       .001 
   120          230      .2107       .01         230      .4601       .001 
   150          173      .1695       .02         179      .1406 
   180           65       762                     63      .4136       .001 
Fixed-Head 
    60           73      .2201                    73      .3809       .01 
    90           73      .1411                    73      .1444 
   120           73      .4713       .001         73      .3202       .01 
   150           73      .1290                    73      .0734 

Fig. 1. Dynamic visual acuity test apparatus, showing subject’s head position fixed by means of a bite board. 

each target to approximately 1 second. For each 
target size, the position of the checkerboard was 
different (randomly so) from that in the Ortho- 
Rater binocular test, and was different in each suc- 
ceeding test. 

The next step was to determine the subject's DVA 
threshold with the targets moving across the screen 
from left to right. Each subject was tested (binoc- 
ularly) at four speeds, and a total of six angular 
velocities were used during the course of the research 
(20, 60, 90, 120, 150, and 180 degrees/second). For 
each test speed the subject was first shown a practice 
target to familiarize him with the speed and path of 
the targets to follow. DVA tests always were given 
in the order of increasing target speed, and for each 
test, target size always progressed from largest to 
smallest. The subject was required to call out “top,” 
“bottom,” “left,” or “right” for each target as it 
moved across the screen until the combination of 
target size and velocity prevented his being able to 
discriminate the position of the checkerboard. Each 
target made only one sweep across the screen, the 
projector carriage shifting automatically to the next 
slide after the target left the screen. Scoring followed 



TABLE 2 
TEST-RETEST RELIABILITY COEFFICIENTS 

Test                            N       r        p 

CFF                             66    .8079    .001 
ACA Ratio                       66    .8217    .001 
Ortho-Rater Static Acuity       80    .7359    .001 
Screen Static Acuity            80    .5290    .001 
DVA Free-Head (°/second) 
    60                          36    .6853    .001 
    90                          27    .5961    .01 
   120                          36    .5826    .001 
   150                          27    .4279    .05 
DVA Fixed-Head (°/second) 
    60                          10    .2941 
    90                          10    .5432 
   120                          10    .2812 
   150                          10    .3777 



TABLE 3 
SUMMARY OF MEAN STATIC AND DYNAMIC ACUITY SCORES 

                Static Acuity           Dynamic Visual Acuity (Free-Head) 
                Ortho-Rater Screen Test 20°/sec     60°/sec     90°/sec     120°/sec    150°/sec    180°/sec 
Age      Sex      N M         N M         N M         N M         N M         N M         N M         N M 

16-20    Male    37 11.24    37 9.97     10 13.00    36 9.58     25 7.04     36 6.00     25 4.32     12 4.58 
         Female  72 10.69    72 9.03      8 11.00    71 7.90     63 6.56     71 5.27     63 3.89     12 3.17 
21-25    Male    45 11.22    45 10.73    17 13.06    44 9.32     26 6.38     44 6.09     26 3.73     18 3.89 
         Female  33 11.12    33 9.24      5 11.80    32 7.88     24 6.63     32 5.28     24 3.92      8 3.63 
26-30    Male    16 11.50    16 10.31     6 12.17    15 8.87      9 6.11     15 5.80      9 3.56      6 3.17 
         Female   5 10.80     5 8.80      -           5 7.00      5 5.60      5 4.60      5 2.60      - 
31-35    Male     6 11.67     6 9.83      2 13.00     5 8.00      3 7.33      5 7.20      3 4.00      2 4.50 
         Female   8 12.13     8 10.25     3 11.67     8 8.63      5 6.20      8 6.38      5 3.20      3 3.33 
36-40    Male     1 11.00     1 7.00      -           1 8.00      1 6.00      1 6.00      1 3.00      - 
         Female   5 11.40     5 8.40      -           5 8.20      4 5.50      5 4.20      4 3.75      1 2.00 
41-45    Male     1 14.00     1 12.00     -           1 12.00     1 9.00      1 7.00      1 5.00      - 
         Female   3 11.67     3 9.33      -           3 8.33      3 6.67      3 5.67      3 5.00      - 
46 &     Male     4 11.25     4 9.00      -           4 7.00      4 5.00      4 4.75      4 3.00      2 2.00 
over     Female   -           -           -           -           -           -           -           - 

All      Male   110 11.32   110 10.28a   35 12.89   106 9.21a    69 6.58    106 6.03b    69 3.90     40 3.93 
ages     (SD)       1.73        2.53         1.04        2.31        1.43        1.93        1.05        1.57 
         Female 126 10.95   126 9.13a    16 11.38a  124 7.93a   104 6.47    124 5.28    104 3.83     24 3.25 
         (SD)       1.79        1.95         1.32        1.88        1.61        1.64        1.32        1.53 
         Total  236 11.12   236 9.67     51 12.41   230 8.52    173 6.51    230 5.63    173 3.84     64 3.67 
         (SD)       1.77        2.31         1.33        2.18        1.54        1.82        1.24        1.59 

a Indicates that the mean score (over all age groups) for males is significantly different from that for females for the given test, with p > .999 (t test). 

b Indicates same as above, except with p > .998. 




the same criterion as for the Ortho-Rater, i.e., the 
subject's score is the number (from 1 to 15) of the 
last correctly discriminated target preceding two con- 
secutive misses. With the exception of the bite board, 
the procedure for fixed-head subjects was the same 
as for free-head subjects. In every acuity test, static 
or dynamic, the subject was urged to guess at the 
answer when he was not sure. This tends to over- 
come obvious individual differences in preference for 
guessing. 
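
The scoring criterion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `threshold_score`, the representation of a test run as a list of booleans, and the score of 0 for a subject who never answers correctly are all hypothetical and are not taken from the authors' procedure.

```python
def threshold_score(responses):
    """Return the number of the last correct target before two consecutive misses.

    `responses` holds one boolean per target, ordered from the largest
    target (number 1) to the smallest. This input format is an assumption.
    """
    score = 0              # assumed value when no target is ever discriminated
    misses_in_a_row = 0
    for number, correct in enumerate(responses, start=1):
        if correct:
            score = number
            misses_in_a_row = 0
        else:
            misses_in_a_row += 1
            if misses_in_a_row == 2:
                break      # two consecutive misses end the scoring
    return score


# A single isolated miss (target 4) does not end the run; the two
# consecutive misses on targets 6 and 7 do.
example = [True, True, True, False, True, False, False, True]
print(threshold_score(example))
```

Under this reading, an isolated miss is forgiven and a later correct response after the two consecutive misses is ignored.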

Subjects were randomly assigned the head-fixed or 
head-free condition, and of those subjects retested, 
some were randomly chosen to be tested under the 
alternate head condition, while the others were re- 
tested under the same condition. 


Subjects 


The research has been in 
years. To date, a total of 236 subjects have been 
tested (110 males and 126 females), and 96 of these 


subjects were tested at least twice. The age range 
was from 16 to 


falling in the 1 
thors’ research 
transportation and driver 


lation of drivers. All subjects were drivers, and were 


University of California, 
ition to both 


RESULTS 


1. As reported in a recent publication 
(Burg & Hulbert, 1959), no correlation was 
found between CFF and ACA ratio, or be- 
tween either CFF or ACA ratio and either 
static or dynamic acuity. 

2. Low but significant correlations were 
found between DVA and Ortho-Rater static 
acuity, these correlations decreasing with in- 
creasing target velocity and being generally 
lower and less consistent in trend for fixed- 
head DVA than for free-head DVA. The same 
trend is apparent in the correlations 





TABLE 4 
VISUAL ACUITY EQUIVALENTS OF 
ORTHO-RATER SCORES 

Ortho-Rater      Visual Angle 
Score            (minutes        Snellen 
(monocular)      of arc)         Rating 

     1            10.0           20/200 
     2             5.0           20/100 
     3             3.33          20/67 
     4             2.5           20/50 
     5             2.0           20/40 
     6             1.67          20/33 
     7             1.43          20/29 
     8             1.25          20/25 
     9             1.11          20/22 
    10             1.0           20/20 
    11             0.91          20/18 
    12             0.83          20/17 
    13             0.77          20/15 
    14             0.71          20/14 
    15             0.67          20/13 
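
The entries of Table 4 appear to follow a simple rule: the visual angle is 10 minutes of arc divided by the Ortho-Rater score, and the Snellen denominator is 20 times that angle. The sketch below reproduces the table on that assumption. The rule is inferred from the tabled values and is not stated by the authors, and the function name `ortho_rater_equivalents` is hypothetical.

```python
def ortho_rater_equivalents(score):
    """Convert an Ortho-Rater score (1-15) to (visual angle, Snellen rating).

    Assumes angle = 10 / score (minutes of arc) and a Snellen denominator
    of 200 / score, both inferred from the values printed in Table 4.
    """
    angle = 10.0 / score
    snellen_denominator = round(200.0 / score)
    return round(angle, 2), f"20/{snellen_denominator}"


for score in range(1, 16):
    angle, snellen = ortho_rater_equivalents(score)
    print(score, angle, snellen)
```

On this rule a score of 10 corresponds to 1.0 minute of arc, i.e., 20/20.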


tests, the coefficients are all statistically sig- 
nificant and respectably high. 

4. Table 3 presents a summary of the mean 
static and free-head DVA scores categorized 
by age and sex. There is a consistently better 
performance for the male subjects for each 
test, when age groups are lumped. While some 
of these differences are not statistically sig- 
nificant, those for the screen static acuity and 
three of the six DVA speeds are significant at 
the .002 level or better (t test). To permit 
clearer interpretation, Table 4 shows the vis- 
ual angles and Snellen ratings corresponding 
to the Ortho-Rater scores used in Table 3. 
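
As an illustration of how such a sex difference can be tested from summary statistics alone, the sketch below computes a t ratio from the means, standard deviations, and Ns given in the "All ages" rows of Table 3. The unpooled standard error used here is an assumption; the report says only that a t test was used, so the authors' exact computation may differ. The function name `t_from_summary` is hypothetical.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t ratio for a difference between two independent means.

    Uses the unpooled standard error sqrt(sd1^2/n1 + sd2^2/n2),
    which is an assumed choice, not necessarily the authors'.
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Screen static acuity, males vs. females (Table 3, all ages).
t_screen = t_from_summary(10.28, 2.53, 110, 9.13, 1.95, 126)

# Ortho-Rater static acuity, males vs. females (Table 3, all ages).
t_ortho = t_from_summary(11.32, 1.73, 110, 10.95, 1.79, 126)

print(round(t_screen, 2), round(t_ortho, 2))
```

With these inputs the screen test yields a much larger t ratio than the Ortho-Rater test, in line with the statement that only some of the differences reach significance.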

Due to the small number of subjects in the 
higher age brackets, no generalizations can be 
made about differential DVA or static acuity 
as a function of age. 


Discussion 


In general, the results of this research sup- 
port the findings of Ludvigh and Miller. Both 
studies reveal a progressive decrease in acuity 
for a moving target as target velocity in- 
creases. The fact that the present study found 
low but significant correlations between static 
acuity and DVA, while Ludvigh and Miller 
did not, is undoubtedly due to the high degree 
of similarity between our static and dynamic 
targets, as well as the more heterogeneous 
subject population we used. Ludvigh and 
Miller used Snellen ratings for static acuity 
and a series of Landolt rings for their (mon- 
ocular) DVA test, and their subjects were 
young healthy males with 20/20 vision or 
better (uncorrected). Thus, the restricted 
range of static acuity represented in their test 
population would quite naturally tend to 
reduce the correlation between static and 
dynamic acuity. While the true magnitude of 
the correlation between the two is still uncer- 
tain, the inevitability of their interrelation is 
not. Quite obviously, a person who is blind, 
or nearly so, will always do poorly in both 
static and dynamic acuity tests. 
Although considerably more representative 
of the general population in this respect, the 
present group of subjects also is limited in its 
range of static acuity (from 20/13 to 20/40 
corrected). While this again serves to under- 
emphasize the true correlation between static 
and dynamic acuity, these results are of con- 
siderable value in generalizing to practical 
situations. For example, it is the authors’ be- 
lief that dynamic visual acuity may be at 
least as important as static acuity in perform- 
ing the task of driving an automobile, and the 
study reported here represents the initial 
phases of a research project to investigate 
this hypothesis. Consequently, subjects were 
chosen (drivers, with static acuity of 20/40 
or better in one eye) so as to maximize their 
usefulness in the continuing research pro- 
gram, in which it is intended to use as many 
of the same subjects as are available. This 
would not be possible if a random sample of 
the general population had been used. 
The results clearly indicate that factors 
other than static acuity play an important 
part in determining an individual's ability to 
discriminate a moving object. Even when cor- 
rected for attenuation the highest correlation 
between Ortho-Rater static acuity and DVA 
is only .394. In addition, the importance of 
static acuity as a determiner of DVA becomes 
progressively less as target velocity increases. 
There is much room for speculation as to the 
precise nature of the nonacuity factors, but 
they reflect the efficiency of integration of 
the oculomotor system, as well as nonvisual 
factors such as attention, differential practice 
effect, and experimental error. Research cur- 
rently underway on eye movement analysis 
and visual pursuit efficiency should shed con- 
siderable light on this problem area. 
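
The correction for attenuation mentioned above can be illustrated with the standard Spearman formula, in which an observed correlation is divided by the square root of the product of the two reliability coefficients. The sketch assumes that the authors applied this formula to the 60°/second free-head correlation in Table 1 (.2798) and the corresponding test-retest reliabilities in Table 2 (.7359 and .6853). That pairing is an inference, adopted because it reproduces the reported value of .394. The function name is hypothetical.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Spearman's correction: observed r divided by sqrt of the reliabilities' product."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Ortho-Rater static acuity vs. free-head DVA at 60 degrees/second.
corrected = correct_for_attenuation(r_xy=0.2798, r_xx=0.7359, r_yy=0.6853)
print(round(corrected, 3))
```

Because the reliabilities are well below 1.00, the corrected coefficient is noticeably larger than the observed one, yet it still leaves most of the DVA variance unexplained by static acuity.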

Static acuity measured on the screen cor- 
related more highly with DVA than did 
Ortho-Rater static acuity. This would be ex- 
pected as the testing situations were more 
similar in that the screen static test and DVA 
tests employ the same target slides and also 
duration of exposure is limited in both in- 
stances. Fixed-head DVA correlated with 
static acuity to a lesser extent than did free- 
head DVA. This is not surprising since the 
limits of angular rotation of the eye alone 
permit foveal vision over an arc of perhaps 
90°-100°, while the combination of head and 
eye movements permits the subject to fixate 
the target over the full 180° sweep. Thus, 
fixing the head reduces the time during which 
the subject can view the target clearly. 

With regard to test reliability it should be 
explained that during the course of the study, 
no attempt was made to equate all test-retest 
conditions for each subject. For example, 
factors such as time of day (and correspond- 
ing fatigue and/or eyestrain), level of motiva- 
tion, interval between test and retest, and 
person conducting the experiment were not 
equated. Three different individuals took 
turns serving as experimenter and although 
all were thoroughly trained, there may have 
been sufficient inter- and intra-experimenter 
variability to have adversely affected the 
reliability results. It is reasonable to expect 
even higher reliability coefficients if care is 
taken to prevent these possible sources of 
variation. Also, it appears likely that testing 
a large number of subjects under fixed-head 
conditions would serve to reveal the same 
consistency of results that prevailed for the 
free-head condition. 

As for age and sex differences in perform- 
ance, no age related differences appeared and 
generally higher scores resulted for males 
than for females. Neither ex post facto 
“logic” nor previous research offers any clear- 
cut explanation for the latter result. It is 
conceivable, for example, that as a conse- 
quence of the experimental situation, the 
general level of motivation was lower for the 
female subjects. However, there is no proof 
that this was the case, and this is a problem 
which merits further investigation. Also, 
additional performance data from subjects in 
the higher age brackets are necessary before 
any conclusions can be drawn concerning pos- 
sible differential age effects in DVA. 


SUMMARY 


The results of this research clearly indicate 
that a person’s ability to discriminate a 
moving target cannot be predicted adequately 
from his static acuity, and that the adequacy 
of this prediction decreases as the speed of 
the moving target increases. 

The exact nature of those factors other 
than static acuity that influence dynamic 
acuity is not yet known, but it is probable 
that they involve the efficiency of the entire 
oculomotor system. 

No relationship was found between dy- 
namic visual acuity (DVA) and either 
critical flicker frequency or lateral phoria 
(ACA ratio). Also, the small number of 
subjects in the higher age brackets makes 
impossible a generalization as to the effects 
of age on DVA performance. Finally, the 
results suggest a consistent and significant 
difference in performance between male and 
female subjects, the latter performing less 
adequately. 


It is suggested that testing of a large 
number of additional subjects of both sexes 
and of all ages will serve to correct the several 
inconsistencies appearing in these results, but 
it is not expected that the basic conclusions 
will be significantly altered. Once having 
established DVA as a relatively independent, 
reliable measure of visual ability, the next 
step becomes the study of the relationship 
between DVA and performance in a variety 
of tasks where discrimination of moving 
objects plays a key role, such as in driving, 
flying, ball playing, and the like. Studies are 
currently underway toward this end. 



REFERENCES 
Burg, A., & Hulbert, S. F. Dynamic visual acuity 
and other measures of vision. Percept. mot. Skills, 
1959, 9, 334. 

Davis, C. J., & Jobe, F. W. Further studies on the 
A. C. A. ratio as measured on the Ortho-Rater. 
Amer. J. Optom. Monogr., 1957. 

Hulbert, S. F., Burg, A., Knoll, H. A., & Mathew- 
son, J. H. A preliminary study of dynamic visual 
acuity and its effects in motorists’ vision. J. Amer. 
Optom. Ass., 1958, 29(6), 359-364. 

Ludvigh, E. The influence of dynamic visual acuity 
on the visibility of stationary objects viewed from 
an aircraft flying at constant altitude, velocity and 
direction. USN Sch. Aviat. Med. joint project Rep., 
1953, No. NM 001 075.01.03. 

Ludvigh, E., & Miller, J. W. A study of dynamic 
visual acuity. USN Sch. Aviat. Med. joint project Rep., 
1953, No. NM 001 075.01.01. 

Ludvigh, E., & Miller, J. W. An analysis of dynamic 
visual acuity in a population of 200 naval aviation 
cadets. USN Sch. Aviat. Med. joint project Rep., 
1954, No. NM 001 075.01.07. (a) 

Ludvigh, E., & Miller, J. W. Some effects of train- 
ing on dynamic visual acuity. USN Sch. Aviat. 
Med. joint project Rep., 1954, No. NM 001 075. 
01.06. (b) 

Ludvigh, E., & Miller, J. W. The effects on dynamic 
visual acuity of practice at one angular velocity on 
the subsequent performance at a second angular 
velocity. USN Sch. Aviat. Med. joint project Rep., 
1955, No. NM 001 110 501.09. 

Miller, J. W. The effect of altered illumination on 
visual acuity measured during ocular pursuit. USN 
Sch. Aviat. Med. joint project Rep., 1956, No. 
NM 001 110 501.12. (a) 

Miller, J. W. The measurement of dynamic visual 
acuity while the observer is rotating. USN Sch. 
Aviat. Med. joint project Rep., 1956, No. NM 001 
110 501.11. (b) 

Miller, J. W., & Ludvigh, E. Dynamic visual acuity 
when the required pursuit movement of the eye 
is in a vertical plane. USN Sch. Aviat. Med. joint 
project Rep., 1953, No. NM 001 075.01.02. 

Miller, J. W., & Ludvigh, E. A shortened procedure 
for the testing of dynamic visual acuity. USN 
Sch. Aviat. Med. joint project Rep., 1955, No. 
NM 001 110 501.08. 

Miller, J. W., & Ludvigh, E. The results of testing 
the dynamic visual acuity of 1000 naval aviation 
cadets. USN Sch. Aviat. Med. joint project Rep., 
1956, No. NM 001 110 501.10. 


(Received May 20, 1960) 



Journal of Applied Psychology 
1961, Vol. 45, No. 2, 117-119 


WELSH’S INTERNALIZATION RATIO 
AS A BEHAVIORAL INDEX 


BERNARD J. FINE 


Quartermaster Research and Engineering Center, Natick, Massachusetts 


Welsh (1956) has presented an Internal- 
ization Ratio (IR) which attempts to express 
in simple quantitative terms some of the inter- 
relationships between certain of the MMPI 
scales concerned with mood and behavior. 

The IR is derived by relating the “mood” 
scales to the “behavior” scales of the MMPI 
as described by the formula 

$$\mathrm{IR} = \frac{Hs + D + Pt}{Hy + Pd + Ma}$$

where Hs, D, Pt, Hy, Pd, and Ma represent 
the hypochondriasis, depression, psychas- 
thenia, hysteria, psychopathic deviate, and 
hypomania scales in that order. The IR is 
defined so that it yields a theoretical value 
of 1.00 in “normal” cases. Individuals who 
tend to “internalize” their problems, who 
experience somatic symptoms and subjective 
feelings of stress, would be expected to obtain 
scores above 1.00. Those who “externalize” 
their conflicts would be expected to score 
below 1.00. 
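
The ratio defined above is easy to compute once the six scale scores are available. The following sketch is illustrative only: the function name, the argument order, and the example scores are hypothetical, and it is assumed that the scores entered are standard (T) scores, for which equal values on all six scales give the theoretical value of 1.00.

```python
def internalization_ratio(hs, d, pt, hy, pd, ma):
    """Welsh's IR: sum of the three "mood" scales over the three "behavior" scales."""
    return (hs + d + pt) / (hy + pd + ma)


# Equal scores on all six scales give the theoretical "normal" value.
print(internalization_ratio(50, 50, 50, 50, 50, 50))

# Hypothetical profile with elevated behavior scales (an "externalizer").
print(round(internalization_ratio(48, 50, 46, 55, 60, 58), 2))
```

A profile with relatively high Hy, Pd, and Ma scores falls below 1.00, and one with relatively high Hs, D, and Pt scores falls above it.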
It would be expected that most individuals 
who get into trouble or give trouble to others 
either verbally by griping, cursing, and the 
like, or physically by fighting, engaging in 
unlawful activities and so forth, would be 
extreme “externalizers.” This assumption is 
substantiated by data presented by Welsh 
(1956), derived from various studies, which 
indicate that whereas groups of college 
students range from .90 to .96 in mean IR 
scores, groups of delinquents and prison 
inmates range from .84 to .89. 

In certain experimental situations, it is 
desirable to eliminate the so-called “bad 
actors” from among the test subjects. For 
reasons of economy and administration of the 
experiment it would be helpful if the indi- 
viduals could somehow be identified before- 
hand. The research reported here investigates 
the effectiveness of the IR in distinguishing 
between “satisfactory” and “unsatisfactory” 
test subjects. 



Study I 
Method 


Subjects. The subjects were 34 enlisted men who 
had been part of a test subject pool at a military 
installation at some time during the period extending 
from early 1957 to early 1959 but who were no 
longer in the pool. The 34 men constituted the total 
population of men whose service as test subjects 
had been terminated during the aforementioned 
period. Of the 34 subjects, 6 were discharged for 
medical reasons and are not considered here, The 
remaining 28 subjects constituted the population for 
this study. 

Procedure. During the course of their terms as test 
subjects, all subjects had completed the MMPI as 
part of a research program devoted to determining 
relationships between personality variables and indi- 
vidual responses to environmental stress. 

From independent records kept at the installation 
it was determined that 13 of the 28 subjects were 
honorably discharged from the Army while still 
serving in the capacity of test subjects, having fully 
completed their tours of duty. These men were per- 
forming satisfactorily as test subjects right up until 
their time of discharge. They will be referred to as 
the “satisfactory” group. The remaining 15 subjects 
had been dismissed as members of the test subject 
pool and reassigned to other duties because of un- 
satisfactory conduct on or off duty while assigned to 
the pool. Unsatisfactory conduct is defined as fist 
fights, auto theft, excess profanity on duty, drunken- 
ness, refusal to perform duties required of them on 
test, and other behavior generally intolerable of test 
subjects in the test subject pool. This group will be 
referred to as the “unsatisfactory” group.¹ 

It was predicted that the mean IR for the satis- 
factory group would be significantly higher than the 
mean IR for the unsatisfactory group. 


Results 


IRs were calculated for the 28 subjects. 
The mean IR for the satisfactory group was 
.91; the mean IR for the unsatisfactory group 
was .82. The difference between the means 
was significant (t = 2.44, p < .025, 1-tailed 
test), indicating verification of the prediction. 


¹ The decision to dismiss subjects as members of 
the test subject pool was made by persons unaware 
of the performance of the subjects on any psycho- 
logical test. 



TABLE 1 

CLASSIFICATION OF SUBJECTS IN THE SATISFACTORY AND 
UNSATISFACTORY GROUPS ACCORDING TO WHETHER 
OR NOT THEIR IR SCORES FELL ABOVE .87 

                                  IR           IR at or 
                                  above .87    below .87 

Test subject pool 
    Satisfactory subjects            9             4 
    Unsatisfactory subjects          2            13 
Field exercise group 
    Satisfactory subjects            7             2 
    Unsatisfactory subjects          2             6 


Inspection of the IR scores indicated that 
the IR would discriminate most effectively 
between the satisfactory and unsatisfactory 
subjects if .87 was established as a “cut off” 
point. Table 1 shows the classification of 
subjects in the satisfactory and unsatisfactory 
groups according to whether or not their 
scores fell above .87. 

Since the IR was adopted as part of the 
test battery for the selection of test subjects 
for the test subject pool shortly after this 
study was completed, it has not been possible 
to validate the .87 cut off point on this 
population. However, a second study, de- 
scribed below, provided an opportunity for 
further investigation on a similar population. 


Study II 
Method 


Subjects. The subjects were 15 enlisted men and 
two officers who participated in a 6-week field exer- 
cise in a subarctic region during the summer of 1959. 

Procedure. Prior to leaving for the field exercise, 
all of the subjects completed the MMPI. The MMPI 
data were not used in the selection of the men for 
the field exercise. IRs were calculated for all of the 
subjects. 


Upon completion of the field exercise, five civilians 
who had ace e trip, acting 


d to rank the 


again. The judges had had 


d serve the subjects both on 
and off duty during the field exercise. 


The subjects who received the nine highest average 
rankings, based on the Preferences of the five judges, 
were termed the “satisfactory group,” and the sub- 
jects who received the eight lowest average rankings 
were termed the “unsatisfactory group.” 




It was predicted that the satisfactory group would 
have a significantly higher mean IR than the un- 
satisfactory group and that significantly more sub- 
jects having IRs above .87 would fall into the satis- 
factory category than would subjects having IRs at 
or below .87. (Conversely, it was expected that more 
subjects having IRs of .87 or below would fall into 
the unsatisfactory category than would subjects hav- 
ing IRs above .87.) 


Results 


In order to determine the consistency be- 
tween the judges in their rankings of the 
subjects, Kendall's coefficient of concordance 
W was computed (Siegel, 1956, p. 229-). 
The resultant W of .706 was converted to 
chi square (Siegel, 1956, p. 236), yielding a 
χ² of 56.48 (16 df), which is statistically sig- 
nificant (p < .01), indicating a high degree of 
consistency between the judges in ranking the 
subjects. 
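
The conversion from W to chi square reported above follows the usual large-sample relation χ² = k(N − 1)W, with N − 1 degrees of freedom, where k is the number of judges and N the number of subjects ranked. The sketch below applies it to the reported figures (k = 5, N = 17, W = .706). The function and parameter names are hypothetical.

```python
def kendall_w_chi_square(w, judges, subjects):
    """Chi-square approximation for Kendall's W: k * (N - 1) * W with N - 1 df."""
    degrees_of_freedom = subjects - 1
    chi_square = judges * degrees_of_freedom * w
    return chi_square, degrees_of_freedom


chi_square, df = kendall_w_chi_square(w=0.706, judges=5, subjects=17)
print(round(chi_square, 2), df)
```

The result agrees with the χ² of 56.48 on 16 df given in the text.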

The mean IR score for the satisfactory 
group was .98 and for the unsatisfactory 
group, .85. This difference is statistically 
significant (t = 2.05, p = .03, 1-tailed). 

Table 1 shows the distribution of subjects 
with respect to satisfactory performance and 
to the .87 cut off point. The difference shown 
in Table 1 between the satisfactory category 
and the unsatisfactory category is statisti- 
cally significant at the .05 level using Fisher's 
exact probability test (Siegel, 1956, p. 
104). 
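
Fisher's exact probability for a 2 × 2 table such as those in Table 1 can be computed directly from the hypergeometric distribution. The sketch below sums the probabilities of the observed table and of all more extreme tables in the predicted direction, with the margins held fixed. It is an illustrative implementation using only the Python standard library, not the author's computation, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_one_tailed(a, b, c, d):
    """One-tailed Fisher exact probability for the table [[a, b], [c, d]].

    Returns P(top-left cell >= a) with all four margins fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)
    largest_a = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, largest_a + 1)
    )
    return favorable / total_tables


# Field exercise group (Study II): satisfactory 7 and 2, unsatisfactory 2 and 6.
print(round(fisher_exact_one_tailed(7, 2, 2, 6), 4))

# Test subject pool (Study I): satisfactory 9 and 4, unsatisfactory 2 and 13.
print(round(fisher_exact_one_tailed(9, 4, 2, 13), 4))
```

For the field exercise group the one-tailed probability falls just below .05, consistent with the significance level reported above.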


Discussion 


In both studies, the prediction that the 
satisfactory group would have a significantly 
higher mean IR than the unsatisfactory group 
was substantiated. In addition, the prediction 
that the .87 cut off point, derived from the 
data of the first study, would discriminate 
significantly between satisfactory and un- 
satisfactory subjects in the second study was 
verified. 

In general, it appears that the IR has value 
as a behavioral index although more gen- 
eralizable data from a larger population are 
certainly necessary and desirable. 

The question may still be raised, however, 
as to the advantage of using the IR over any 
of the separate scales that compose it since 
some or all of them may discriminate as well. 



Certainly, if one or more scales perform the
same function individually as they do in
combination (IR), then parsimonious procedure
would demand using the former. Accordingly,
the mean standard scores on each of the six
scales used in deriving the IR were computed
for the “satisfactory” and “unsatisfactory”
groups in the first study. Only the Ma scale
significantly discriminated between the two
groups. Classification of “satisfactory” and
“unsatisfactory” subjects according to their
Ma scores using two cut off points yielding
maximum discriminability resulted in the
failure of the Ma scale to discriminate
significantly between subjects on the
satisfactory-unsatisfactory dimension.
Examination of the data indicated that the
obtained significant difference between the
two group Ma means was due to two or three
extreme scores in one of the groups.

An identical analysis of the mean scale
scores for the satisfactory and unsatisfactory
groups in Study II yielded essentially the
same results. Here, the D and Pt scales
significantly discriminated between the groups
but neither scale approached significance in
discriminating between individuals. Again the
significance between the means was apparently
due to a few extreme scores.

The IR shows consistency from Study I
to Study II both in discriminating between
groups and individuals. The individual scale
scores are inconsistent in discriminating
between groups and fail to discriminate between
individuals. Clearly, then, the combination of
the six scales into the IR yields something
which the six scales taken individually do
not.² Emphasis on the relationship between
the scales rather than on the absolute values
of individual scale scores appears to be
warranted, in the case of Welsh's
Internalization Ratio.
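
A minimal sketch of how a ratio index with a fixed cut off point could be applied to individual profiles. The formula used here, (Hs + D + Pt) / (Hy + Pd + Ma) on standard scores, is the form in which Welsh's ratio is usually given; it is an assumption as far as this passage is concerned, since the text names only three of the six scales. The two profiles are hypothetical, and treating a score exactly at .87 as "satisfactory" is an arbitrary choice.

```python
CUT_OFF = 0.87  # cut off point reported for the first study


def internalization_ratio(profile):
    """Ratio of two three-scale sums of standard scores (assumed formula)."""
    numerator = profile["Hs"] + profile["D"] + profile["Pt"]
    denominator = profile["Hy"] + profile["Pd"] + profile["Ma"]
    return numerator / denominator


def predicted_category(profile, cut_off=CUT_OFF):
    """The satisfactory group had the higher mean IR, so IR >= cut off is read as satisfactory."""
    if internalization_ratio(profile) >= cut_off:
        return "satisfactory"
    return "unsatisfactory"


# Hypothetical standard-score profiles, for illustration only.
subject_a = {"Hs": 52, "D": 55, "Pt": 54, "Hy": 53, "Pd": 56, "Ma": 55}
subject_b = {"Hs": 45, "D": 48, "Pt": 47, "Hy": 55, "Pd": 60, "Ma": 65}

for name, profile in (("A", subject_a), ("B", subject_b)):
    print(name, round(internalization_ratio(profile), 2), predicted_category(profile))
```

Because the index is a ratio, two subjects with very different absolute scale scores can fall on the same side of the cut off, which is the point made above about relationships between scales.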


Summary 


Two studies are reported. In the first, 13 
men were classified as “satisfactory” and 15 
men as “unsatisfactory” test subjects by 
independent criteria. Welsh's MMPI-derived 
Internalization Ratio (IR) significantly 
discriminated between the two groups. In 
addition, an IR of .87 was found to discrimi- 
nate maximally between individuals. 

The second study, using 17 men engaged in 
a field exercise, further validated the IR as 
an index of group desirability. Furthermore, 
the .87 cut off point again significantly dis- 
criminated between satisfactory and unsatis- 
factory individuals. 


REFERENCES 


Siegel, S. Nonparametric statistics for the behavioral
sciences. New York: McGraw-Hill, 1956.

Welsh, G. S. An anxiety index and an internal-
ization ratio for the MMPI. In G. S. Welsh &
W. G. Dahlstrom (Eds.), Basic readings on the
MMPI in psychology and medicine. Minneapolis:
Univer. Minnesota Press, 1956. Pp. 298-307.


(Received May 20, 1960) 


² It has been pointed out and should be noted here
that a combination of scale scores, such as in a
discriminant function analysis, could possibly
perform in this situation, despite the lack of consistent
findings using the scales individually. However,
unless such a procedure resulted in significantly
greater predictability over that obtained here with
the IR, the latter would appear to be the more
convenient method.


Journal of Applied Psychology
1961, Vol. 45, No. 2, 120-122


SOME ASPECTS OF ATTEMPTED, SUCCESSFUL, 
AND EFFECTIVE LEADERSHIP ¹


BERNARD M. BASS 


Louisiana State University 


The origins of leadership were examined in
detail in Parts III, IV, and VI of Leadership,
Psychology and Organizational Behavior
(Bass, 1960). A variety of theorems can be
derived from an examination of this discussion
concerning the development and change
of individuals in their attempts to lead, their
success in influencing others, and the
effectiveness of their leadership.

The present paper describes tests of the
empirical adequacy of three theorems:


1. Successful leadership is related more to
ability in effective compared to ineffective
groups.

2. Successful leadership is related more to
esteem in effective compared to ineffective
groups.

3. Discrepancies between esteem and
self-esteem are manifested in unsuccessful
leadership.


The logic of each theorem will be discussed
in more detail as we consider the
analysis developed to verify the empirical
adequacy of each theorem.


SUBJECTS AND METHOD 


Subjects of these experiments were 255 ROTC
cadets meeting in 51 problem-solving groups of five
men each. [...]sions, and total group participation
were measured by clocks activated by voice.²


Successful Leadership is Related More to
Ability in Effective Compared to Ineffective
Groups


Available data made it possible to test the
hypothesis that proficiency is associated more strongly
with successful leadership in effective rather than
ineffective groups. This hypothesis was suggested by
reasoning that if the more able members of a group
succeed as leaders rather than the less able,
their leadership will be more effective and
consequently the group will be more effective. We
elucidate elsewhere (Bass, 1960) the relation between
successful leadership, effective leadership, and group
effectiveness which in summary is: "Successful
leadership must occur usually in order for groups to
become more effective. Such leadership can be
described as effective leadership" (p. 133).

Next, we relate ability and effective leadership,
summarizing the argument by stating: "Who will
be effective, if he is successful (as a leader)?"
(p. 139), "He who has the ability to solve the
group's problems" (p. 140).

A further derivation leads to the proposition that
a group will be more effective when its abler
members are given high status so that they are more
likely to succeed as leaders: "the higher the
congruence within a group between status and ability, the
more effective the group is likely to be" (p. ...).

These propositions fit with our corollary derived
here that if we observe that the ablest members
were indeed the successful leaders of a group, the
group would be more effective than if less proficient
members were most influential.

The 51 groups of ROTC cadets, each with
10 problems, provided data concerning 510
discussions. These 510 discussions were divided into the
255 most publicly effective discussions and the 255
least publicly effective.³ Then, the 1250 measures of


² For details concerning these measurements, see
Bass, Flint, and Pryer (1957) and Bass, Pryer, Gaier,
and Flint (1958).

³ Public effectiveness was the extent the group
decision following discussion was more accurate than [...]


relative success as a leader (A)⁴ drawn from the 255
effective discussions among five members each were
correlated with their scores for initial accuracy
(RX).⁵ Public and private success as a leader were
not used because total public or private successful
leadership was found in another analysis to be higher
in effective group discussions and lower in ineffective
group discussions (Bass et al., 1957). On the other
hand, by its construction, total relative success was
always zero for any single discussion, per se.

For the 1250 measures of ability and leadership
drawn from the 255 effective discussions, a highly
significant product-moment correlation of .45 was
found between any member's initial accuracy (RX)
and his subsequent relative success as a leader (A).
When the same product-moment correlation between
initial accuracy and relative successful leadership
was calculated for the 1250 paired values of ability
and leadership drawn from the less effective
discussions, a coefficient of only .07 was obtained between
initial accuracy and relative success as a leader.
(This difference between .45 and .07, each based on
1250 cases, was significant at the 1% level according
to a t test of the significance of the difference
between z coefficients although with this large df it is
difficult not to obtain statistical significance.
Presumably because of the large df's, the reliability of
the obtained coefficients of .45 and .07 is high enough
to be looked at as a stable description of the effective
and ineffective discussions and the question of
statistical significance is not too appropriate.)
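
The comparison of the two coefficients can be sketched with Fisher's r-to-z transformation. This is a normal-approximation version written for illustration, not necessarily the exact t procedure the author applied. It treats the 1250 measures in each set as independent cases, as the text does, although members of the same discussion are not strictly independent.

```python
import math
from statistics import NormalDist


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_tailed = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Coefficients and case counts as reported above.
z, p = compare_independent_correlations(0.45, 1250, 0.07, 1250)
print(f"z = {z:.2f}, significant at the 1% level: {p < 0.01}")
```

With 1250 cases per coefficient the standard error is about .04, which illustrates the remark that significance is hard to avoid at this sample size.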

The analysis was repeated for private
effectiveness. A product-moment correlation of .48 was
obtained between initial accuracy and relative
successful leadership for the 1250 scores drawn from
the 255 discussions more effective privately while a
significantly lower corresponding correlation of .36
was obtained for the 1250 scores drawn from the
255 less effective discussions.

The data strongly support the argument that
proficiency of members is associated with their success
as leaders more in effective in contrast to ineffective
groups, particularly publicly effective in comparison
to publicly ineffective ones. It would seem that
Plato's proposal to have philosopher-kings has merit


[...] average initial ranking by the members of the familiar[...]

⁴ [...] Public effectiveness of a discussion [...]
initial opinions (Xi) and the other members'
[...] opinions (Yj) minus [...]
between the first member's final opinions [...]
everyone else's initial rankings (Xi). It can be shown
that [...] A = 0.

⁵ For convenience the symbols RX refer to the rho
correlation between a rank order like X, the initial
rankings of a subject before discussion, and R, the
criterion rank order. RY is the symbol of ρRY.

⁶ Private effectiveness = ρRY − ρRX: the difference
between the average accuracy of members after a
discussion than before the discussion.


if the “philosophers” are proficient in solving the 
kingdom’s problems.


Successful Leadership Related to Greater Es- 
teem in Effective Compared to Ineffective 
Groups 


By esteem, we mean the judged value of a person 

to his group regardless of his position in the group 
(Bass, 1960, p. 277). A member’s esteem usually is 
positively associated with his ability (pp. 284-285). 
As just shown, a member’s ability correlates more 
with successful leadership in effective groups than 
in ineffective groups. It follows that esteem also is 
likely to associate more with success as a leader in 
effective groups than in ineffective groups. In other 
words, the successful-effective leader is expected to 
be more esteemed than the successful-ineffective
leader. The coach who guides his football team to 
victory is the hero; the losing coach is hanged in 
effigy. 
The 51 ROTC cadet groups were divided into the 
25 highest in public effectiveness on all 10 discussions 
combined and the 25 lowest on the average in all 10 
discussions by each group. This pooling was neces- 
sary since esteem was based on opinions at the end 
of the last discussion only. One group, middlemost 
in effectiveness, was discarded. The correlation
between esteem⁷ and relative success as a leader (A)
was .42 for the 125 members of the 25 effective
groups while it was only .22 for the 125 members of
the 25 ineffective groups. For the 250 cases, the
difference between .42 and .22 was significant at
the 5% level of confidence. The experimental results
again fit nicely with the logical expectation that 
successful-effective leaders are more esteemed than 
successful-ineffective ones. 


Unsuccessful or Aborted Leadership as a Con- 
sequence of the Discrepancy between Es- 
teem and Self-Esteem 


In Chapter 16 on conflicts in groups (Bass, 1960),
we deduced that members whose self-esteem (F)
is greater than the esteem accorded them by their
associates (E) will attempt more leadership but will
succeed less.


the higher a member's self-esteem, the more likely
he is to attempt leadership. The higher he is
esteemed by others in the group, the more successful
he will be in his attempts to lead. It follows that
if a member’s self-esteem is much higher relative
to his esteem, he will attempt leadership, but his
attempts may be rejected by the other members
(p. 322). (Quoted by permission of Harper &
Brothers)


⁷ Esteem was based on a five-point rating of the
extent a member's removal from the group would be a
loss to the group. A member's esteem (E) was the
average rating assigned him by the other members at
the end of the last discussion. His self-esteem was his
rating of himself on the same five-point scale (F).


A significant positive correlation of .32 was found 
between self-esteem and attempted leadership. The 
corresponding correlation was .19 between self-esteem 
and relative success as a leader among the 255 cadets. 
On the other hand, esteem correlated .36 with
successful leadership but .42 with attempted
leadership.⁸ To contrast the success of "modest" and
"immodest" cadets who attempted similar amounts
of leadership, it was thus necessary to control the
individual covariance between esteem, self-esteem,
attempted, and successful leadership. Therefore, 252
of our 255 ROTC cadets were divided into 126 who 
were above the median in attempted leadership and 
126 who were lower in such attempts on the average 
during all 10 discussions combined. Then, for each 
subject, the average extent others in his group 
esteemed him at the end of the last of the 10 discus- 
sions (E) was subtracted from his self-esteem (F). 
A high discrepancy score (F-E) suggested “im- 
modesty," an appraisal by the subject that his self- 
esteem was greater than the esteem others accorded 
him. A low (F-E) score or a negative one implied 
"modesty" or appraising oneself as relatively less 
valuable in comparison to the group’s opinion. Each 
sample of 126 was subdivided into a subsample of 
63 “modest” and 63 “immodest” men. Thus, four 
subsamples of 63 men each were isolated, low or high 
in attempted leadership and modest or immodest. 
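
The subdivision described above can be sketched as two successive median splits: first on attempted leadership, then on the discrepancy score F − E within each half. The records below are hypothetical, and the handling of values exactly at the median (assigned to the lower half) is an assumption; the text says only that 252 of the 255 cadets were used to obtain equal halves.

```python
from statistics import mean, median


def split_at_median(records, key):
    """Return (low, high); values at or below the median go to 'low' (assumed rule)."""
    cut = median(key(r) for r in records)
    low = [r for r in records if key(r) <= cut]
    high = [r for r in records if key(r) > cut]
    return low, high


def success_by_attempts_and_modesty(cadets):
    """Mean relative success in the four attempts-by-modesty subsamples."""
    table = {}
    low_att, high_att = split_at_median(cadets, key=lambda c: c["attempts"])
    for label, group in (("low attempts", low_att), ("high attempts", high_att)):
        # A low or negative F - E is "modest", a high F - E is "immodest".
        modest, immodest = split_at_median(group, key=lambda c: c["F"] - c["E"])
        table[label, "modest"] = mean(c["success"] for c in modest)
        table[label, "immodest"] = mean(c["success"] for c in immodest)
    return table


# Hypothetical records: attempts, self-esteem F, esteem E, relative success.
cadets = [
    {"attempts": 10, "F": 3, "E": 4.0, "success": 0.0},
    {"attempts": 12, "F": 2, "E": 3.5, "success": 0.1},
    {"attempts": 14, "F": 5, "E": 3.0, "success": -0.4},
    {"attempts": 16, "F": 4, "E": 3.5, "success": -0.3},
    {"attempts": 30, "F": 3, "E": 4.5, "success": 0.4},
    {"attempts": 32, "F": 4, "E": 4.0, "success": 0.3},
    {"attempts": 34, "F": 5, "E": 3.0, "success": 0.1},
    {"attempts": 36, "F": 5, "E": 4.0, "success": 0.2},
]

table = success_by_attempts_and_modesty(cadets)
for cell, value in sorted(table.items()):
    print(cell, round(value, 2))
```

Splitting on modesty inside each attempts half, rather than across the whole sample, is what keeps the "modest" and "immodest" groups comparable in how much leadership they attempted.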

Among the 126 cadets above the median in
attempted leadership, the 63 “modest” cadets earned
an average leadership success (A) of .37; the 63
“immodest” cadets earned an average success of only
.10. Again, among the 126 ca[...]
in attempted leadership, the [...]
more successful than the [...]
[...]tempting similar amounts [...]
relative success of the “[...]
it was −.37 for the “immodest.”

[...] correlation between [...]ship. (One must [...]
succeed at it.) Similarly, [...]ing scores of .1[...]
more successful [...] level) than the “immodest” cadets


⁸ These were averages of product-moment correlations
obtained within nine samples at differing levels of
motivation and amounts of organization.




attempting similar amounts of leadership who earned 
average relative leadership success scores of —.14. 
The interaction of “modesty” and attempts to lead 
was not significant. 


SUMMARY


Theorems from Leadership, Psychology 
and Organizational Behavior (Bass, 1960) 
provided hypotheses for experimental tests. 
This paper reports experimental verification 
of the following: 


1. A highly significant correlation of .45
was found between initial accuracy and
relative successful leadership in 255 more
effective discussions, while the correlation was
only .07 in 255 less effective discussions.

2. A correlation of .42 was obtained
between esteem and relative successful
leadership in 25 groups with higher average
effectiveness on the 10 problems, while the
correlation was .22 in the 25 less effective
groups.

3. Those men whose self-esteem outweighed
their esteem exhibited a mean success as
leaders of −.14 while those whose esteem was
higher than their self-esteem earned a
successful leadership score of .18. These
effects emerged when the differences in
attempts to lead were controlled.


REFERENCES 


Bass, B. M. Leadership, psychology and
organizational behavior. New York: Harper, 1960.

Bass, B. M., Flint, A. W., & Pryer, M. W. Group
effectiveness as a function of attempted and
successful leadership. Technical Report 12, 1957,
Louisiana State University, Contract N7 ONR
35609.

Bass, B. M., Pryer, M. W., Gaier, E. L., & Flint,
A. W. Interacting effects of control, motivation,
group practice, and problem difficulty on attempted
leadership. J. abnorm. soc. Psychol., 1958,
352-356.


(Received May 26, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 2, 123-125


CUMULATIVE COMMUNALITY CLUSTER ANALYSIS 
OF WORKERS' JOB ATTITUDES 


ROGER HARRISON ¹


Procter and Gamble Company 


A recent paper by Wherry (1958) examines
the similarity of factorial structures obtained
from four studies of the SRA Employee
Inventory. Wherry (1954, 1958) concludes that
the factors found in the four studies are quite
similar, though the claim has been disputed
(Baehr, 1956). Wherry named a general and
five group factors: Working Conditions,
Financial Reward, Supervision, Management,
and Personal Development.

This paper extends the investigation of the
reliability of dimensional analysis to another
paper-and-pencil job attitude questionnaire,
using a method of analysis different from
those applied in the studies cited by Wherry.
Tryon’s cumulative communality cluster
analysis was applied to the intercorrelations
of the items from job attitude questionnaires
administered to two industrial work groups.
The clusters found are compared with the
factors named by Wherry.


PROCEDURE 


Samples. All hourly paid men (N = 350) from
Plant A, a medium sized manufacturing installation,
were administered a job attitude questionnaire. All
hourly paid men (N = 650) from Plant B, a large
manufacturing plant, were administered a similar
questionnaire. Plants A and B made similar products,
using similar processes.

The questionnaires. Each of the two questionnaires
contained about 100 questions, of which 80 in each
questionnaire were of sufficient general interest to
be selected for factor analysis. Sixty-eight questions
were common to the two questionnaires. The present
study focuses on the ways in which responses to
the latter questions were clustered in the two
analyses.

The items were written in such a way as to present
to the subjects an aspect of their job situation which
they were to rate on a five-point scale. The points
¹ Now at Yale University.
The author wishes to express his appreciation to
Robert C. Tryon of the University of California
(Berkeley) for his suggestions and to Daniel E.
Bailey of the same institution for carrying out the
factor analyses. Beverly A. Veatch carried out the
blind clustering of the variables.


on the scale constituted five degrees of satisfaction 
with the job aspect referred to in the question, 
ranging from “very unfavorable,” through “neutral,” 
to “very favorable.” 

The factor analysis. The responses to the items 
were dichotomized so as to give as near to a 50-50 
split as possible between high and low groups. 
Tetrachoric correlations were computed among the 
items, using a program developed for the IBM 650 
by H. W. Garrison and M. Charap of Educational 
Testing Service. Tryon’s cumulative communality 
cluster analysis was applied to the intercorrelations, 
using an IBM 701 program written by D. E. Bailey 
and J. O. Neuhaus of the University of California 
(Tryon, 1958). 

Cumulative communality cluster analysis begins 
with a multiple group method factor analysis. An 
objective criterion for the selection of groups is 
programed, resulting in the selection of the item 
in the matrix with the highest variance of squared 
correlations as a pivot variable for each group. 
Other items whose profiles of correlations in the 
matrix are similar to those of the pivot variable 
are grouped with it, and the first factor is passed 
through the centroid of this group. The process is 
repeated on each succeeding residual matrix. This 
method of factoring results in orthogonal factors 
which are close to simple structure. That is, the 
factors are so located that each item is likely to have 
high loadings on only a few factors. 
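
The pivot-selection rule can be illustrated on a small matrix. This is only a sketch of the stated criterion (highest variance of squared correlations), not the Bailey and Neuhaus program; excluding the diagonal and using the population variance are assumptions, and the correlation matrix is hypothetical.

```python
from statistics import pvariance


def pivot_variable(corr):
    """Index of the item whose squared correlations with the other items vary most."""
    def spread(i):
        squared = [corr[i][j] ** 2 for j in range(len(corr)) if j != i]
        return pvariance(squared)
    return max(range(len(corr)), key=spread)


# Hypothetical 4-item correlation matrix, for illustration only.
corr = [
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.2, 0.3],
    [0.1, 0.2, 1.0, 0.4],
    [0.1, 0.3, 0.4, 1.0],
]

print(pivot_variable(corr))
```

An item that correlates very highly with some items and near zero with the rest has a high variance of squared correlations, which is why such an item makes a sharp anchor for a group.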

The program continues to extract factors until 
at least 97.5% of the common variance of the matrix 
has been accounted for by the loadings of the items 
on the factors. The program reiterates to stable 
communality estimates. 

Following the multiple group factor analysis, items 
were grouped into clusters on the basis of the pat- 
terns of their factor loadings. The method used was 
suggested by R. C. Tryon in a personal communi- 
cation. 

A table which showed the coordinates of each
of the variable domains (variable vectors) on each
of the factor dimensions was constructed. The
coordinates were obtained by dividing each of the
factor loadings by the square root of the variable's
communality. Items were included in a cluster if
their profiles of coordinates were judged similar,
i.e., if their vectors had similar directions in the
space defined by the orthogonal factors. Items with
low communalities (below .40) were excluded from
further consideration.
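
The coordinate computation can be sketched as follows. The loadings are hypothetical, and the communality is taken as the sum of squared loadings on the orthogonal factors shown, which is an assumption about how it was obtained. In the study similarity of direction was judged by the analyst; the cosine used here is one way of putting a number on "similar directions", not the procedure actually followed.

```python
import math

MIN_COMMUNALITY = 0.40  # items below this value were excluded


def coordinates(loadings):
    """Divide each loading by the square root of the item's communality."""
    communality = sum(a * a for a in loadings)
    root = math.sqrt(communality)
    return [a / root for a in loadings], communality


def direction_similarity(u, v):
    """Cosine of the angle between two unit-length coordinate profiles."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical loadings on three orthogonal factors.
items = {
    "item_1": [0.60, 0.30, 0.10],
    "item_2": [0.55, 0.32, 0.08],
    "item_3": [0.10, 0.20, 0.70],
    "item_4": [0.20, 0.10, 0.15],  # communality well below .40
}

profiles = {}
for name, loadings in items.items():
    coords, h2 = coordinates(loadings)
    if h2 >= MIN_COMMUNALITY:
        profiles[name] = coords

sim_12 = direction_similarity(profiles["item_1"], profiles["item_2"])
sim_13 = direction_similarity(profiles["item_1"], profiles["item_3"])
print(sorted(profiles), round(sim_12, 2), round(sim_13, 2))
```

Dividing by the square root of the communality gives every retained item a vector of unit length, so items are compared on direction alone and not on how much common variance they carry.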

The clusters were constructed by blind analysis,
using only the coordinates of the variable domains.
The analyst did not know the item content.
Variables from the two questionnaires were clustered


separately and independently. At the conclusion of 
the blind clustering, there were a few variables (5 
to 10) about which it had been difficult to make
a clustering decision. Such variables were finally 
clustered using the item content as a guide to 
choosing one of two possible clusters. 


RESULTS AND DISCUSSION 


The clusters of items arrived at in the 
two analyses are quite similar. Table 1 is a 
joint frequency distribution of variables in 
the clusters from the two plants. It shows 
how items which were clustered together in 
one analysis tended to be clustered together 
in the other. 

Table 2 lists the job aspects referred to 
by the items in each cluster in the two 
analyses. There are 12 distinguishable clusters 
in the Plant A analysis and nine in that of 
Plant B. It appears that Plant A employees 
made finer distinctions than did the workers 
in Plant B. Some items which are separated 
into two clusters in Plant A tend to be 
grouped in the same cluster in Plant B. For 
example, this is true of items dealing with 
attitudes toward one’s foreman. The same 
items form three clusters in Plant A; they 
are in two clusters in the Plant B analysis. 


The kinds of variation represented by the
clusters in Table 2 are quite similar to those
discussed by Wherry (1958) in his summary
of four factor analyses on the SRA Employee
Inventory, and the clusters found in the
present two analyses may easily be grouped
under the headings from Wherry's summary.
The lack of a general factor in the present
study is, of course, an artifact of the choice
of method of analysis. The correlation
between the cluster domains (vectors) ranges
from .16 to .94 with a median of .51 in the
analysis of Plant A, and from .31 to .84 with
a median of .56 in the analysis of B. Clearly
a general factor could have been extracted
if desired, since the cluster domains are all
positively correlated with one another. In the
present study, since the objective was to
study patterns of response rather than to
search for underlying independent factors,
a general factor was not extracted. The
significant finding is that whether one looks for
underlying dimensions of job attitudes, as in
the Wherry rotation, or whether one seeks to
group items according to similar patterns of
response as in the present study, the same
kinds of variation emerge.


TABLE 1
JOINT FREQUENCY DISTRIBUTION OF VARIABLES IN THE TWO CLUSTER ANALYSES, BY CLUSTERS


CITTA Plant B Clusters mM 
Clusters — 1 2 40 4d d X 5 E G Uh Av By pa | 
1 9 1 E 
2 2 2 1 : 
3 1 3 ; 
4 5 1 1 a © 
5 4 2 
6 3 1 1 : 
7 1 1 
8 5S 4 g 1 : 
9 1 2 A 
10 3 1 : 
11 1 1 2 
12 2 E 
Ua 1 1 1 
Ad 2 : 
Be 1 1 
Total 14 3 6 4 3 3 10 8 2 2 1 1 
a Items which did not fit into a cluster, 




TABLE 2
DESCRIPTIONS OF THE CLUSTERS IN EACH ANALYSIS


Plant A
Cluster and Description
1. Consideration and helpfulness
   of the foreman
2. Foreman’s competence in
   administering the department
3. Foreman’s competence in
   supervising the job
4. Physical working conditions and
   nervous tension on the job
5. Advancement opportunities
6. Employee plans and benefits
7. Earnings
8. Administrative competence of
   higher management
9. Higher management's
   consideration for employees
10. Friendliness and fairness of
    management
11. Importance of one's job
12. Facilities: medical department and
    locker rooms


Plant B
Cluster and Description
1. Consideration and helpfulness of the
   foreman and his competence in
   departmental administration
2. Foreman’s competence in
   supervising the job
3. Physical working conditions
4. Advancement opportunities
5. Employee plans and benefits
6. Earnings
7. Administrative competence and
   consideration for employees of
   higher management
8. Friendliness and fairness of
   management
9. Willingness of company to spend
   for safety and working conditions


There is thus some reason to believe that
the kinds of variation in job attitudes
identified by Wherry are not specific to the
questionnaire used nor to the method of
dimensional analysis. Some caution in interpretation
of the present results is, however, necessary.
Several recent studies have shown that job
aspects variously interpreted as
“self-realization” or as opportunities for advancement,
recognition, and achievement, are central to
managers’ evaluations of their jobs (Herzberg,
Mausner, & Snyderman, 1959; Schwarz,
1959). Appropriate questions for evaluating
these aspects of workers’ jobs were not
included in the present questionnaires, so
the results are not directly comparable to
studies which have found such factors
important.


SUMMARY 


Job attitude questionnaires were administered
to two groups of industrial workers. The
questionnaires were analyzed separately for
the two groups by Tryon’s cumulative
communality cluster analysis. The resulting
clusters, arrived at by blind analysis, were
compared with each other and with the
factors summarized by Wherry. The two sets of
clusters were quite similar.

Kinds of variation indicated by the clusters 
were similar to those found by Wherry in his 
summary of four analyses of the SRA Em- 
ployee Inventory. 

It is concluded that these kinds of varia-
tion may be common to a wide variety of 
industrial jobs and that they do not depend 
strictly on the questionnaire used or on the 
method of analysis. 


REFERENCES 


Baehr, Melany. A reply to R. J. Wherry concern-
ing “An orthogonal re-rotation of the Baehr and
Ash studies of the SRA employee inventory.”
Personnel Psychol., 1956, 9, 81-92.

Herzberg, F., Mausner, B., & Snyderman, B. The
motivation to work. New York: Wiley, 1959.

Schwarz, P. A. Attitudes of middle management
personnel. Paper read at American Psychological
Association, Cincinnati, September 1959.

Tryon, R. C. Cumulative communality cluster
analysis. Educ. psychol. Measmt., 1958.

Wherry, R. J. An orthogonal re-rotation of the
Baehr and Ash studies of the SRA employee
inventory. Personnel Psychol., 1954, 7, 365-380.

Wherry, R. J. Factor analysis of morale data:
Reliability and validity. Personnel Psychol., 1958.


(Received June 9, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 2, 126-127 


FURTHER EVIDENCE OF A PRACTICE EFFECT 
ON THE MILLER ANALOGIES TEST 


ROBERT M. COLVER AND CHARLES D. SPIELBERGER
Duke University 


In a study designed to test the hypothesis 
that scores on the Miller Analogies Test 
(MAT) improved with practice, Spielberger 
(1959) found that scores were significantly 
higher on retest with an alternate form of the 
MAT for each of three independent samples 
of Ss. But since all of the Ss in these samples
were either graduate students or senior under-
graduate honor students in psychology, it was
suggested that the observed practice effect
might be limited to "bright, psychologically 
sophisticated students" who might profit more 
from practice than less intelligent, less sophis- 
ticated Ss. The present study was designed to 
evaluate the effects of practice on the MAT 
for senior undergraduate Ss with less psycho- 
logical sophistication and lower mean intelli- 
gence than characterized the Ss in the pre- 
vious study. 

METHOD 

Forms G and H, alternate forms of the MAT,
were administered in a counterbalanced order to 36
liberal arts seniors enrolled in a course in educational
psychology for nonpsychology majors. The major
fields of study of t[...] (8), foreign
languages (3), political science (3), history (12),
religion (1), [...] (8). None of these Ss ha[...]
Form H. Nineteen Ss (Group II) [...]
[...]me forms in reversed order.


RESULTS 


Prior to analyzing the data, Form H raw
scores were made equivalent to Form G scores
by adding two points to all Form H scores in
the 30 to 70 score range (Miller, 1952, p. ...).

¹ The cooperation an[...]

The means and standard deviations for the
two successive administrations of the MAT
are presented in Table 1. The mean score
obtained by the Ss on their second experience
with the test was significantly higher than the
mean score for their initial performance (t
= 5.66, p < .001). Of the 36 Ss, 27 improved.
The consistency of the improvement was
further indicated by the Pearson r between
initial and final scores of .86.
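
As a rough check, the t for related measures can be approximated from the summary statistics alone, using the means and SDs given in Table 1 and the reported r. This is a sketch under the assumption that the SDs can be plugged directly into the usual formula for the standard deviation of a difference; whether they were computed with N or N − 1 is not stated.

```python
import math


def paired_t_from_summary(mean1, sd1, mean2, sd2, r, n):
    """t for related measures from two means, two SDs and their correlation."""
    sd_diff = math.sqrt(sd1 ** 2 + sd2 ** 2 - 2.0 * r * sd1 * sd2)
    standard_error = sd_diff / math.sqrt(n)
    return (mean2 - mean1) / standard_error


# First test: M = 56.30, SD = 11.88; second test: M = 62.00, SD = 11.86.
t = paired_t_from_summary(56.30, 11.88, 62.00, 11.86, r=0.86, n=36)
print(f"t is roughly {t:.1f}")
```

The result lands near, not exactly on, the reported 5.66. The inputs are rounded to two decimals, and the value of t is quite sensitive to r when the correlation is this high, so a small rounding difference in r could account for the gap.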

Since a lack of equivalence between the
alternate forms was suggested in Spielberger's
study, the data were further evaluated by an
analysis of variance (Lindquist, 1953, Type
II design) which tested the equivalence of
Forms G and H as well as the effects of
practice. This analysis was possible since the
alternate forms had been given in
counterbalanced order. This analysis also provided a
more precise test of a practice effect in that
the t test for related measures used to
evaluate the difference between the means in
Table 1 was based on an error term inflated
by variance attributable to the use of
alternate forms of the MAT.

The means and SDs for Groups I and II
are presented in Table 2 where it will be
noted that Form G appears to be somewhat
easier than H. The F test of the effect of
form, however, was not significant (F = [...],
p > .05). The F test of the effect of practice
was highly significant (F = 34.29, p < .001).


TABLE 1
MEANS AND SDS FOR TWO SUCCESSIVE ADMINISTRATIONS
OF THE MILLER ANALOGIES TEST
(N = 36)


   First Test          Second Test
Mean      SD         Mean      SD

56.30    11.88       62.00    11.86




TABLE 2
MEANS AND SDS FOR FORMS G AND H OF THE MILLER
ANALOGIES TEST GIVEN IN A COUNTERBALANCED ORDER
(N = 17 in each group)


             Form G          Form H
Sample     Mean    SD      Mean    SD
[...]


Note. For this analysis two of the Ss in Group II were
randomly eliminated in order to have equal Ns in each group.


CONCLUSION 


This study was designed to determine
whether the practice effect on the MAT found
by Spielberger (1959) was due to the fact
that his Ss were bright, psychologically
sophisticated students. In the present study, 27
of 36 liberal arts students (75%) improved
their scores and the mean increase in score for
the group was highly significant. This result


was similar to the findings of Spielberger’s 
study where 39 out of 48 psychology students 
(81%) improved and where the magnitude 
of improvement was also highly significant. 
Thus, it would appear that the observed prac- 
tice effects on the MAT are not limited to 
psychology students. The findings of this 
study support the hypothesis that scores on 
the MAT improve as a function of previous 
experience with an alternate form of the test 
and that this improvement is not limited to 
bright, psychologically sophisticated students. 


REFERENCES 


Lindquist, E. F. Design and analysis of experiments
in psychology and education. Boston: Houghton
Mifflin, 1953.

Miller, W. S. Miller Analogies Test manual. New
York: Psychological Corporation, 1952.

Spielberger, C. D. Evidence of a practice effect on
the Miller Analogies Test. J. appl. Psychol., 1959,
43, 259-263.


(Received June 9, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 2, 128-129


THE PAIRED-COMPARISON METHOD AND CENTRAL 
TENDENCY EFFECT IN ESTHETIC JUDGMENTS 


JAMES E. KENNEDY 1 


Bureau of Industrial Psychology, University of Wisconsin 


Woodworth (1938), in reviewing the litera- 
ture concerned with esthetic preferences for 
geometric figures, noted that the results of 
these studies may be confounded by the cen- 
tral tendency effect, i.e., the tendency for the 
subject to prefer figures toward the center of 
the range presented to him. For example, a 
study by Witmer is summarized in which it 
was found that when subjects were presented 
with various graduated series of isosceles tri- 
angles, the particular triangle most preferred 
was the one in the middle of the series. However,
when the range in the series became too
extreme the central tendency effect was not 
obtained and subjects preferred figures to- 
ward the less extreme end of the series. 

Austin and Sleight (1951), pursuing the 
problem, believed that the effects of this cen- 
tral tendency bias could be eliminated by 
presenting the figures to the subject by the 
paired-comparison method. The procedure 
they used was as follows. Twelve triangles 
(altitude-to-base proportions ranging from
¼" × 1" to 3" × 1", by ¼" altitude steps) were
combined in all possible combinations of two
(66 pairs in all) and mimeographed on separate
sheets. The order of each pair of triangles
on the pages, as well as the order of the pages
in the booklets, was randomized.
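
The booklet construction just described can be sketched in a few lines of Python. This is only an illustration: the function name `build_booklet`, the `seed` argument, and the altitude values (twelve altitudes in quarter-inch steps on a 1-inch base) are assumptions made for the example, not details taken from Austin and Sleight's materials.

```python
import itertools
import random


def build_booklet(altitudes, seed=None):
    """Return every unordered pair of stimuli in randomized order.

    Hypothetical helper: each pair is one booklet page. The left/right
    order within a page and the order of the pages are both shuffled.
    """
    rng = random.Random(seed)
    pages = [list(pair) for pair in itertools.combinations(altitudes, 2)]
    for page in pages:
        rng.shuffle(page)      # randomize position of the two triangles
    rng.shuffle(pages)         # randomize page order within the booklet
    return [tuple(page) for page in pages]


# Assumed stimulus set: 12 altitudes from 0.25 to 3.00 inches.
altitudes = [0.25 * k for k in range(1, 13)]
booklet = build_booklet(altitudes, seed=1)
print(len(booklet))  # 66 pages, one per pair
```

With 12 stimuli the number of pairs is 12 × 11 / 2 = 66, which agrees with the count given in the text.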

The results of that study are shown by the 
solid line in Figure 1. Triangles with altitudes 


between 1 and 2 inches were most preferred.
Is this preference for figures in the center of
the range dependent or independent of a central
tendency effect? Austin and Sleight concluded
a central tendency effect was not operating
since the figures had been exposed two
at a time in a random sequence.

1 Grateful acknowledgement is made to R. Mathias
for providing subjects and to R. Spence for performing
the calculations.

The purpose of the present study was to
explore the problem a step further by essentially
replicating the Austin and Sleight study
with one basic change. Since Austin and
Sleight had used only one series of stimuli in
their study, there was no basis for inferring
whether preferences were or were not dependent
upon a series effect. In the present
study three series of stimuli were used corresponding
to the lower, middle, and upper
parts of the range used by Austin and Sleight.
More specifically, Series I consisted of 12
isosceles triangles ranging from ¼" × 1" to
1⅝" × 1", by ⅛" altitude steps. (Since all
triangles in the original study as well as the
current study had a 1 inch base, we will refer
henceforth to triangles in terms of their altitudes.)
Series II and III consisted of 12 triangles
each and ranged from ⅞" to 2¼" and
from 1⅝" to 3", respectively. Three sets of
booklets were prepared, each containing one of the
three series of figures. The three types of
booklets were distributed randomly to a group
of 78 nursing students taking an introductory
course in psychology. One-third of the students
responded in booklets containing Series I figures,
one-third to Series II figures, and the
remaining to Series III figures.
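
The logic of the three overlapping series can be made concrete with a small sketch. The bounds used below (¼ to 1⅝, ⅞ to 2¼, and 1⅝ to 3 inches, in ⅛-inch steps) are an assumption: they are the only evenly spaced values that give 12 triangles per series within a range ending at 3 inches, but the scanned text is not fully legible at this point. The helper names are hypothetical.

```python
from fractions import Fraction


def make_series(start, stop, step):
    """List the altitudes from start to stop (inclusive) in equal steps."""
    n_steps = (stop - start) / step
    if n_steps.denominator != 1:
        raise ValueError("range is not a whole number of steps")
    return [start + k * step for k in range(int(n_steps) + 1)]


STEP = Fraction(1, 8)  # assumed altitude step, in inches
series = {
    "I": make_series(Fraction(1, 4), Fraction(13, 8), STEP),
    "II": make_series(Fraction(7, 8), Fraction(9, 4), STEP),
    "III": make_series(Fraction(13, 8), Fraction(3), STEP),
}


def rank_in_series(altitude):
    """Position (1 = shortest) of one altitude in each series containing it."""
    return {
        name: values.index(altitude) + 1
        for name, values in series.items()
        if altitude in values
    }


# The same physical triangle sits near the top of one series
# and in the lower part of another.
print(rank_in_series(Fraction(5, 4)))
```

Because a given triangle occupies a different relative position in each series, a preference that follows the position rather than the triangle points to a series effect.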

Fig. 1. Distribution of esthetic preferences for
isosceles triangles of different altitude to base ratios.
(Solid line represents original Austin and Sleight results
and broken lines represent the three series from the
current study. All triangles had 1-inch bases.)
Horizontal axis: altitude in inches.

The results are presented in the three
broken line curves in Figure 1. It is apparent
from inspection of the curves that the preferences
are, despite the use of the paired-comparison
method, not free of a series effect. A
preference for figures in the middle of the
series is seen in Series I and II. The absence
of the central tendency effect in the
Series III data is consistent with Witmer's
finding that the central tendency effect was
not obtained when the range in the series became
too extreme. The central tendency bias can hardly
be said to have been "minimized," let alone
eliminated, when one finds such marked discrepancies
in preferences for a particular triangle
when it appeared in the different series.
For example, the 1¼" triangle is among the
least preferred when it appeared in Series I
but among the most preferred when it appeared
in either the original Austin and
Sleight series or in Series II.

On the basis of these results it must be
concluded that Austin and Sleight's use of the
paired-comparison method did not eliminate
the central tendency effect from their data. In
the course of making their initial judgments,
subjects apparently did acquire some knowledge
about the range of stimulus figures involved
and did persist in showing preferences
for figures in the middle of the range.

While it is clear that the central tendency
effect did operate in the Austin and Sleight
study, it is not clear under what particular
circumstances the central tendency bias could
be expected to be a problem in other preference
studies. One cannot help but speculate
that it might occur in any situation where the
subjects have no preferences or only very
weak preferences for the stimuli. This could
well be the case for isosceles triangles. Faced
with the task of expressing preferences when
they have none, subjects may resolve their 
dilemma by simply choosing the least extreme 
of the stimuli presented to them. 

There is no reason to believe that this bias- 
ing effect would be limited to studies involv- 
ing esthetic preferences. For example, Hall 
and Bennett (1956) investigated the optimal 
diameter of handrails of public stairways by 
having subjects express preferences for diam- 
eters of 1.5, 1.75, 2.00, and 2.50 inches. The 
curve they obtained from this study has the
same general shape as Austin and Sleight’s 
and the possibility must be considered that 
the subjects were only exercising a central 
tendency bias in the absence of preferences 
for the stimuli as such. 

Two solutions to the problem come to mind. 
One, as in the present study, several parts of 
the stimulus range can be explored separately 
to determine if preferences for the stimuli as 
such override a possible series effect. Two, by 
having different subjects express preferences 
for each individual stimulus value no series 
effect could operate. 


REFERENCES 


AUSTIN, T. R., & SLEIGHT, R. B. Aesthetic preference
for isosceles triangles. J. appl. Psychol., 1951, 35,
430-431.

HALL, N. B., JR., & BENNETT, E. M. Empirical assessment
of handrail diameters. J. appl. Psychol., 1956,
40, 381-382.

WOODWORTH, R. S. Experimental psychology. New
York: Holt, 1938.


(Received June 17, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 2, 130-136


DESIGN AND INTERPRETABILITY OF ROAD SIGNS 1


ROBERT W. BRAINARD, RICHARD J. CAMPBELL, AND EDWIN H. ELKIN 2


Ohio State University 


A standardized series of road signs is now 
in use in some of the Western European 
countries. The signs have received consider- 
able attention in this country, partly due to 
their uniqueness and partly due to their ap- 
parent ease of interpretation. The signs make 
minimal use of language, attempting instead 
to convey the desired information through 
pictorial and symbolic representation. 

The road signs used in the United States 
are often ambiguous, require considerable 
time to interpret, depend primarily on written 
language, and lack standardization. These 
factors lead to inconvenience and loss of travel
time, and may contribute, directly and indirectly,
to accidents. It thus becomes a
matter of importance to investigate means for 
improving the interpretability of United 
States road signs.

It was the purpose of the present study to 
determine how well the European signs could 
be interpreted, and to relate these findings to 
sign preferences (stereotype). More specifically,
the aim of the study was fourfold:
(a) to investigate the interpretability of the
European road signs; (b) to determine if
stereotypes exist for signs appropriate to
highway use, and if so, (c) to determine if
there are general characteristics which are
common to both the stereotypes and the
easily interpreted European signs; and (d) to 
test the effectiveness, in terms of increased 
interpretability, of signs based on stereotypes. 


METHOD 


The study was conducted in five phases. The first
two phases employed different methods of assessing
the interpretability of the European signs; the signs,
and their meaning, are shown in Table 1. The third
phase was an attempt to assess the effect of
experience with the signs on their interpretability.
The fourth phase was designed to discover the stereotypes
for sign meanings. The fifth phase was concerned
with determining the interpretability of signs
derived from Phase IV, i.e., the interpretability of
signs based on the stereotypes.

1 This research was conducted in the Laboratory
of Aviation Psychology.

2 We gratefully acknowledge the advice and encouragement
of P. M. Fitts during the planning and
conduct of this study.

R. W. Brainard is now at North American Aviation,
Inc., Columbus, Ohio.

E. H. Elkin is now at Tufts College.

Phase I. Thirty of the European signs 3 (reproduced
on 12" × 12" display cards) were presented
singly. Each sign was displayed to a group of Ss
for a period of 30 sec. The Ss were asked to write
the meaning which they thought the sign conveyed,
within the display interval. At the end of the
interval a new sign was shown. The Ss were 29
students in an introductory psychology course at
Ohio State University.

An interpretation was scored as correct if the
three Es independently judged S's response to convey
the same meaning as the European sign.
The interpretability score used was the percentage
of correct interpretations.

Phase II. The same signs were presented as in
Phase I. However, in this phase Ss were provided
with an answer sheet containing a list of the sign
meanings. They were asked to choose from the list
the one meaning which best matched the sign being
shown. The Ss were 33 students from another introductory
psychology course.

The score used was the percentage of correct
matchings.

Phase III. This phase immediately followed Phase
II and used the same group of 33 Ss. Immediately
following the matching test (Phase II), the signs
were again presented to Ss, but this time their
meanings were given orally by E. Following the
single reading, the method used in Phase I was
repeated and similar scores were obtained.
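
The Phase I scoring rule can be illustrated with a short sketch. It assumes that a response counts as correct only when all three Es judged it correct, which is one reading of the rule stated above; the function names and the example judgments are hypothetical.

```python
def is_correct(judgments):
    """A response is scored correct only if all three Es agree it is.

    Assumption: unanimity among the three judges is required.
    """
    return len(judgments) == 3 and all(judgments)


def interpretability_score(responses):
    """Percentage of Ss whose interpretation of one sign was scored correct.

    `responses` holds one tuple of three booleans (one per E) for each S.
    """
    if not responses:
        raise ValueError("no responses to score")
    n_correct = sum(1 for judgments in responses if is_correct(judgments))
    return 100.0 * n_correct / len(responses)


# Hypothetical data for one sign shown to 29 Ss:
# 20 responses accepted by all three Es, 9 rejected by at least one E.
responses = [(True, True, True)] * 20 + [(True, False, True)] * 9
score = interpretability_score(responses)
print(round(score))  # percentage of correct interpretations
```

The same function applies to Phase II if each matching is recorded as a single correct or incorrect mark repeated three times, although a simple proportion of correct matchings would serve equally well there.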

Phase IV. In this phase, 31 new Ss were provided
with blank sheets of paper with space for drawings
on each page. Sixteen sign meanings were read
aloud, one at a time, at 2-min. intervals. The 16
sign meanings were derived from the 30
European signs and included the meaning of signs
1, 2, 5, 7, 10, 11, 12, 13, 15, 16, 17, 18, 21, 23,
24, 25, 27, 29, and 30 (see Table 1). In three instances
signs having the same or similar meaning
(24 and 25; 2, 14, and 29; 21 and 22) were combined
and only the one meaning was given. Two considerations
determined the sign meanings to be used
in this phase. First, it appeared desirable that the
set of sign meanings should be chosen so that the
entire range of interpretability, found in the
earlier phases, would be sampled. A second consideration
was the applicability of the sign meanings
to the American highway scene.

The Ss' task was to draw, within the 2-min.
interval, a sign which they felt would convey the
desired meaning and be easily interpreted. They
were told that no words could be used, but that
color and/or outline shape of the sign could be
employed as coding dimensions.

The drawings were classified by the Es. The number
of drawings which corresponded to the European
sign was determined, as was the percentage of
major alternative sign designs. In one instance
(Sign #1), Ss were unable to draw a sign comparable
to the European sign because the use of words
was not permitted.

3 A decal, showing the signs and their meaning,
was obtained from the British Automobile Association.

Phase V. In this final phase 10 signs (on
12" × 12" display cards) were made from
the drawings of Phase IV. Twenty-nine new Ss were
used. The experimental conditions and scoring procedure
were the same as in Phase I. The 10 signs involved
in this phase are shown in Table 2.


RESULTS 

Interpretability. The percentage of correct
responses in Phases I, II, and III is presented
in Table 1. Some signs which were
difficult to interpret in Phase I were easily
interpreted in Phase II (e.g., No. 19, 27, 28),
but in general the signs which were
easily interpreted in Phase I were likewise
easily interpreted in Phase II. A Pearson
product-moment correlation coefficient between
the first two phases was computed as
$06, which is significant at the .01 level. - 
There were fewer correct responses in Phase
I than in Phase II, an average of 54%
versus 74% correct responses, respectively. In
Phase III (after Ss were given the correct
sign meanings) the average interpretability of
the signs approached 100%.
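
For readers who want to see how a product-moment correlation between two sets of per-sign scores is obtained, a minimal sketch follows. The score lists are hypothetical placeholders, not the values of Table 1, and the function name `pearson_r` is chosen only for the example.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical percentage-correct scores for six signs in two phases.
phase_one = [100, 93, 76, 55, 24, 0]
phase_two = [91, 97, 76, 73, 58, 30]
r = pearson_r(phase_one, phase_two)
print(f"r = {r:.2f}")
```

In the study each sign contributes one pair of scores, so the number of pairs equals the number of signs, not the number of Ss.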
Stereotypes. Also presented in Table 1,
under "American Stereotypes," is a summary
of the types of signs drawn in Phase IV.
These drawings were separated into three
broad categories. The first category included
those drawings which were directly comparable
to the corresponding European sign.
The second category was composed of the
drawings which were comparable either to the
corresponding European sign, but with some
element added or subtracted, or to other
European signs. These are indicated by
reference to the number of the
comparable European sign plus the notation,
when applicable, of the element which was
added or subtracted. The third category included
drawings which were distinctly different
from any of the European signs. In
these instances a brief description of the
drawing is given.

The results of Phase IV indicate that 
stereotypes for many of the signs appeared in 
the drawing responses. Thus, three of the 
sign meanings (No. 27, 7, 18) led to drawings 
which were sufficiently similar that the major 
characteristic of the drawing was found in 
100% of sign designs. Signs No. 20, 15, 10, 
25, 5, and 13 likewise produced good agree- 
ment among Ss, one major characteristic ap- 
pearing from 45% to 81% of the time. 
Moderate amounts of agreement were found 
in the remainder of the signs, and in many 
instances those characteristics which Es have 
isolated as separate actually have some 
similarities. Although Ss were given the op- 
portunity to use color and sign shape as 
coding dimensions, insignificant use was made 
of them. 

The 10 signs which were constructed from 
the results of Phase IV are presented in 
Table 2. Included in this table are the per- 
centages of correct interpretations of these 
signs, found in Phase V, together with the 
interpretability scores for the European signs 
which correspond in intended meaning; the 
latter scores were obtained in Phase I under 
experimental conditions identical to those of 
Phase V. It can readily be seen that the 
interpretability of the signs based on the 
stereotypes is superior to that of the cor- 
responding European signs in all instances. 
The interpretability score for 7 of the 10 
stereotype-based signs was significantly higher 
(p < .05) than that of the corresponding European
sign(s). However, four of the stereotype-based
signs had a relatively low interpretability
index (76, 59, 59, and 38%).

Relation between interpretability and 
stereotypes. Comparison of the findings for 
the stereotypes with their counterparts in 
Phases I and II provides the data relevant 
to the third aim of the study, the relation 
between stereotypes and the interpretability 
of the European signs.

In general, as percentage of correct interpretations
in Phases I and II increased the
number of drawings of similar signs in Phase

TABLE 1

SUMMARY OF RESULTS FROM PHASES I THROUGH IV

Interpretability: Percentage Correct Responses (Phase I, Phase II, Phase III)
American Stereotype: Percentage of Total Responses, Phase IV
(Comparable to European Sign, Alternative I, Alternative II)

Sign No.   Phase I   Phase II   Phase III   Comparable to European Sign   Alternative I   Alternative II
58 
100 91 100 00 Hand, red, 16 
or black Red circle 
23 
13 100 94 100 52 ‘Tracks crossing 219, 
S eats a road Sign #2 
18 100 97 94 100 00 00 
DOvaLE 
FIRST TO LEKT 


T 7 B 6 
nd $9 a 38 Arrow to left 


DANGEROUS 
LEFT BEND 


26 100 97 100 
CHILOREN 
A 


? 93 97 100 
PEDESTRIAN 
chastine 
| 
á l 
19 93 97 97 
OPENING 
BRIOGE 


93 ; 
omnc n 88 81 
ERE 


93 82 100 


ll, 86 100 “100 


42 

10 e 83 100 100 45 Sign #10 
SPEED usur. 
30 KPH 


with a car 


i A 16 91 97 100 00 00 
Misi 
27 A 76 76 94 100 


[LI 


00 00 


2€ z 233. 
2 LEVEL ctostisc e 18 91 16 52 Tracks crossing 
30 CATES 


Sign #13 a road 


TABLE 1—(Continued)
E 02 85 100 
DICTO. wl 


BE FOLLOWED 


26 10 

30 © | 59 97 97 6 Hand before Man holding 

atta horn ears 
PROHIBITED, 


[A 29 3 
2h A 55 73 100 23 29 a I 
* d is = Man slipping Shiny road 
UP y 
"ROAD 
j 29 
15 A 55 73 94 68 Sign #15, 
mmm with a car 
19 10 
29 52 85 97 58 ‘Tools along Red hand on 
ROAD WORK road road 
AN PROCRESS 


16 


16 A : 45 65 94 32 Si Sign #16, 
on . with a car without 10° 
mite 
r 36 16 
23 | € 24 58 100 00 Barrier in Red or yellow 
pott left lane line 
9 t 


36 16 
10 42 94 00 Barrier in Red or yellow 
left lane line 
x = 29 
42. Arrows all 
2 10 53 88 00 Barrier in directions 
MANET left turn but left 
52 10 
l (5) 00 30 85 00 Barrier in Hand in front 
vum am road of car 
KOOR VIRUS 
" 52 10 
u O 00 59 94 00 Barrier in Hand in front 
CLOSED TO, road of car 
ALL Venictts 
00 11 88 
00 91 100 
00 97 100 
00 70 94 




IV also increased. In Phase I, the signs with 
low scores (below 15% correct) were generally
those which made use of some abstract
coding dimension (e.g., circle and/or slash
line to denote a prohibitive action). The
signs with high scores (above 85% correct)
were characterized (a) by having a direct
counterpart in the American road sign system,
and/or (b) by being a direct pictorial representation
of the sign meaning. Several signs
which were easily interpreted in Phase I, but
were not included in Phase IV, also were
direct pictorial representations (e.g., children
crossing, bridge opening). In Phase IV, when 
the European sign was not drawn (or could 
not be drawn as was the case for the STOP 
sign), the general characteristics of the signs 
which were drawn were the same as those 
present in the easily interpreted European 
signs. 

The most explicit evidence for the relation 
between stereotypes and interpretability 
comes from the results of Phase V. It can be 
clearly seen from Table 2 that all 10 of the 
Stereotype-based signs were easier to interpret 
than their European counterparts; the aver- 
age interpretability score for the former signs 


was approximately 75% as compared with 
45% for the European signs. 


Discussion 


The results indicate that the most readily
interpreted signs fall into two major categories:
(a) signs employing directly pictorial
representations (e.g., road narrows, children
crossing); and (b) signs having direct
counterparts in existing American signs
(e.g., STOP, RR crossing). Signs in both of
these categories have the major feature of
being unambiguous; the former inherently so
and the latter through the experience Ss have
had with them.

Difficulties in interpretation appeared when
unfamiliar and/or ambiguous coding dimensions
were used. For example, the two codes
used in European signs to indicate prohibited
action, circle and slash, were not immediately
clear. The use of these two codes resulted, in
fact, in many reversals (e.g., left turn for no
left turn, motorcycles permitted for no motorcycles
permitted, and vehicle permitted for


TABLE 2

SUMMARY OF RESULTS FROM PHASE V

Percentage of Correct Interpretation

Stereotype-based Signs (Phase V)     Corresponding European Sign
oo 
100 45 (Sign #16) 
o: i 100 55 (Sign #15) 
ferai 
PAL 
| d 100 76 (Sign #7) 
u 
| i 12) 
8 10 (Sign # 
j 3 93.— Jj Sign #23) 
E Ww 
lem 
e q 90 50 (Sign #30) 
sns 
id — 90 76 (Sign #27) 
76 55 (Sign #24) 
] 
= 59 10 (Sign #21) 
59 52 (Sign #29) 
‘ 0 (Sign #1) 
ra S8 0 (ien #11) 
g 
583 wau nent m 


no vehicle permitted). The Ss apparently did
not recognize the presence of the circle
code, and the slash was often misinterpreted
(e.g., as being an overpass) or ignored.

The pictorial signs that were used in this
study were generally unambiguous. Most of
the easily interpreted pictorial signs were of
a unidimensional character, employing a picture
or symbol without the additional coding
dimension of the circle and/or slash. It appears
then that the use of additional symbolic
dimensions tends initially to confuse the sign
interpreter.

In contrast to the purely pictorial signs,
symbolic signs do not seem to lend themselves
as readily to an immediate, correct interpretation.
The symbolic signs which were interpreted
easily in this study had counterparts
in the American road system. The two symbolic
signs which had no counterpart in the
American road system (Signs 11 and 17) were
not interpreted correctly in Phase I and
were only moderately well interpreted in
Phase II.
One of the major questions raised by the
present study involves the relation between
stereotypes as demonstrated in the drawings
and the interpretability of signs as
measured in the early phases. Do the stereotypes
incorporate those features which make
for ease in interpretation? The authors hypothesized
that they would. Thus, we would
expect to find some common basis underlying
the easily interpreted sign and the stereotype.
This hypothesis is partially substantiated
by the results of Phase IV. The signs
which were most difficult to interpret in Phase I
and in Phase II were rarely, if ever, drawn by
Ss in Phase IV. On the contrary, those signs
which were more readily interpreted in the
early phases were drawn more frequently in
Phase IV. In addition, the results showed
that where the stereotype did not agree with
the European sign design, the signs drawn
still tended to reflect those features which
were characteristic of the easily interpreted
signs. When the drawings were compared with
their hard-to-interpret European counterparts
for any given meaning, the drawings
generally substituted a pictorial figure or
familiar symbols for an abstract or unfamiliar
one (e.g., an "X" across a sign, or a
picture of a barrier or hand was substituted
for a circle or slash).

The results from Phase V emphasize even
more directly the beneficial effect on interpretability
when signs are based on a demonstrated
stereotype; the interpretability score
for the 10 stereotype-based signs was generally
considerably higher than for the 10
corresponding European signs. Furthermore,
there were no instances of interpreting a sign 
to mean the very opposite of its intended 
meaning; such reversals were frequent among 
some of the European signs. On the other 
hand, the last four signs shown in Table 2
were interpreted only moderately well, al- 
though they were based on a stereotypical 
level comparable to the remaining six signs 
which yielded high interpretability scores. It 
appears, therefore, that a moderately high 
degree of consensus (e.g., 30-40%) among
sign designers is not always sufficient as a 
basis for designing highly interpretable signs. 

It must be remembered that the above 
results deal primarily with immediate inter- 
pretability of signs. The results of Phase III 
showed that after Ss were told only once what 
the signs meant, the interpretability of most 
signs approached 100%. This suggests that 
some signs having little or no correspondence 
with the stereotype may still be interpreted 
correctly by almost everyone who has a small 
amount of experience with the sign. A very 
different result might be found, of course, if a 
new meaning were given to an old sign for 
which a strong stereotype existed. 


SUMMARY AND CONCLUSIONS 


The purpose of the present study was to 
investigate the interpretability of selected 
European road signs, to determine if stereo- 
types existed for signs, to compare the gen- 
eral characteristics of the European signs 
with the characteristics embodied in the 
stereotypes, and to determine if stereotype- 
based signs enhanced interpretability. 

Interpretability was investigated by two 
different methods. In Phase I Ss wrote the 
meaning which they thought a sign con- 
veyed, and in Phase II they chose from a list 
of possible meanings the one meaning which 
best matched the particular sign being shown. 
In Phase III, the same Ss who had partici- 
pated in Phase II were told the meanings of 
the signs. Then the signs were presented again 
and Ss wrote the meaning which they 
thought the sign conveyed. Phase IV investi- 
gated the stereotypes for road signs. In this 
phase, sign meanings were read to Ss, and 
they designed signs which would convey these 


meanings. Stereotype-based signs were constructed
from the results of Phase IV; the
interpretability of these signs was determined
in Phase V.

The results of the study can be summarized
as follows:


1. Interpretability of the European signs
was partly a function of the method by which 
interpretability was examined. The mean 
interpretability score from Phase I was con- 
siderably lower than for Phase II, although 
the correlation between the two methods was 
significant. 

2. The European signs were moderately 
well interpreted on first presentation; after
one exposure to the correct meaning, inter- 
pretability approached 100%. 

3. The easily interpreted European signs
were generally pictorial representations of the
sign meanings or were counterparts of American
road signs. The signs which were difficult
to interpret generally used abstract, unfamiliar
symbols or included ambiguous cues.

4. Stereotypes for some road signs exist.
The general characteristics found in the
stereotypes were the same as those in the
easily interpreted European signs.

5. Interpretability is enhanced if signs are
stereotype-based. However, signs based on
stereotypes of only moderate strength (30-
40%) will not always be highly interpretable.

6. A small number of the European road
signs could be efficaciously used in the United
States, without necessitating prior instruction
as to their meaning. The majority of the
signs, however, could not be used without a
minimal degree of familiarization.


(Received June 29, 1960) 


Journal of Applied Psychology

VOL. 45, No. 3          JUNE 1961


GENERALIZED THURSTONE AND GUTTMAN SCALES
FOR MEASURING TECHNICAL SKILLS
IN JOB PERFORMANCE 1


DOUGLAS G. SCHULTZ AND ARTHUR I. SIEGEL


Psychological scaling techniques have, for
the most part, been used in the measurement
of attitudes and sensory phenomena. There
is an increasing interest, however, in the
application of these methods to measurement
problems associated with industrial psychology.
For example, Mosel, Fine, and Boling
(1960) have recently reported the usefulness
of scaling for estimating worker requirements.
The study described here investigated the
applicability of the Thurstone and Guttman
scaling techniques for job skill measurement
purposes in several related jobs. Generalized
measurement instruments based on these techniques
would allow a more economical means
of measuring on-the-job performance with a
minimum of different forms. Moreover, the
establishment of a common, scaled job skill
hierarchy would provide a kind of common
base across the related specialties. This base
would have implications for cross-specialty
evaluation,
Mogg 25 job task analysis content, career 
Uire the establishment of training re- 
EE € across specialties, etc., and pos- 
aro, eet give some basis for grouping 
mo m for various purposes. Thus à pos 
Might panum for describing related jobs 
T € provided. i 
Applied Psychological Services, Wayne, Pennsylvania

1 This study was performed under Contract Nonr
...(00) between Applied Psychological Services and
the Office of Naval Research. We are indebted to
... Smith, J. Nagay, G. D. Mayo, S. Benson, and P.
Federman for their assistance throughout the work.

In a series of studies which had as one purpose
the development of criterion measures
for posttraining performance evaluation of
enlisted personnel in various naval aviation
ratings (job specialties), Technical Behavior
Check Lists (TBCLs) were developed for four 
ratings (Richlin, Siegel, & Schultz, 1960; 
Siegel, Richlin, & Federman, 1960). The 
TBCLs were comprehensive, detailed lists of 
the tasks performed by men in each rating. 
They were found to possess those character- 
istics customarily thought to be essential in a 
sound criterion measure. However, it was felt 
that application of psychological scaling meth- 
ods might lead to shorter and more convenient 
check lists and, at the same time, add further 
substance to their meaning. 

The scaling approaches employed were those 
proposed by Thurstone (1929) and by Gutt- 
man (1950). Siegel and Benson (1959) 
demonstrated the scalability, in both the 
Thurstone and Guttman senses, of the skills 
involved in the naval aviation electronics tech- 
nician specialty. Siegel, Schultz, and Benson 
(1960) achieved similar results for the skills 
involved in the Naval specialty of aviation 
machinist’s mate. Although the check lists de- 
veloped in these studies were of value for the 
posttraining evaluation of technicians within 
a particular rating, it appeared that a short, 
scaled check list which would apply to sev- 
eral ratings would have wider significance, 
even greater usefulness, and would also be of 
considerable interest from the standpoint of 
extending the application of scaling tech- 
niques. 

The purpose of the present study, there- 
fore, was to investigate whether technical 
proficiency criterion measurement instruments 
could be constructed which could be applied 
across several related naval job specialties
(ratings) and which could be scaled across
these ratings by both the Thurstone and Gutt- 
man techniques. Achieving this purpose em- 
braced two steps: (a) developing behaviorally 
based items that were general enough to ap- 
ply to the skills included in the several rat- 
ings and yet covered the important duties of 
each rating, and (b) scaling the items over 
the several ratings. 

Electronics was selected as the broad area 
within which the research would focus. The 
following four naval ratings, which were avail- 
able for study, were felt to involve skills of 
various related types within electronics: avia- 
tion electrician's mate, aviation electronics 
technician, aviation fire control technician, 
TRADEVMAN (Training Devices Man). 


DEVELOPMENT AND ADMINISTRATION OF THE 
PRELIMINARY TASK List 


The possibility of constructing a general- 
ized technical skill check list that would scale 
rested, first of all, u 
priate list of the tasks performed in the sev- 
eral ratings and cast 
that would have ess 
ing for all of the fo 
ous Applied Psych 
naval technicians 
specialties provide 
gestions. Consulta 
staff members of 
Training Command. Out of this background 
a list of 28 tasks 
the items was to 
tion in each task, such as “operates” or “cali- 


brates,” to any specific 
equipment. The general directions for the list
stated that each item was to be interpreted
as applying to the “equipment which is
encompassed by the rating.”

the terminology acceptable, 
a few mi 


Douglas G. Schultz and Arthur I. Siegel 


where on the continuum each task would fall
in difficulty for the average striker (a trained
worker with minimum experience in the job
specialty). The respondents were provided
with gummed, prenumbered response labels in
amounts such that the frequency distribution
of the numbers on the stickers roughly
approximated a normal distribution. The
preliminary task list was administered in person
to 242 enlisted supervisory personnel in the
four ratings studied. The supervisors were
distributed among three pay grades (Chief,
First Class, and Second Class Petty Officer),
among 22 squadrons, and across five locations.


THE THURSTONE SCALES


Using the response data obtained from the
administration of the preliminary task list to
the 242 supervisors, the median and interquartile
range were calculated for each item
(or task). These provided the scale (S) and
deviation (Q) values needed for establishing
a scale according to Thurstone’s method of
equal appearing intervals. The S and Q values are
plotted in Figure 1. In examining this figure,
it should be remembered that the rater was
forced to respond on a seven-point scale and
to normalize approximately the distribution
of his responses. The lowest scale value obtained
was 1.52 for Item 7 (Removing) and
the highest was 6.05 for Item 11 (Trouble
shooting/isolating malfunction(s) in). While
this is a satisfactory range of S values, the
very extreme positions are not represented.
The Q values are fairly constant over the entire
range of S values, although there is a
slight suggestion that the Q values are higher
for the more difficult tasks.
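
The S and Q computation described above can be sketched in a few lines. This is only an illustration, assuming that each item's ratings are available as a list of integers on the seven-point scale. The ratings, the item labels, and the function name `scale_and_q` are hypothetical and are not taken from the study. The sketch uses the ordinary median and an inclusive quartile rule; the fractional values reported above (e.g., 1.52) suggest that the original analysis used an interpolated median for grouped data, so the two need not agree exactly.

```python
from statistics import median, quantiles


def scale_and_q(ratings):
    """Return (S, Q) for one item: the median and the interquartile range."""
    s = median(ratings)
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return s, q3 - q1


# Hypothetical difficulty ratings (1 = easiest, 7 = hardest) from seven raters.
ratings_by_item = {
    "easy item": [1, 2, 2, 3, 3, 3, 4],
    "hard item": [4, 5, 5, 6, 6, 6, 7],
}

for item, ratings in ratings_by_item.items():
    s, q = scale_and_q(ratings)
    print(f"{item}: S = {s}, Q = {q}")
```

Under this sketch, items with a small Q (raters agree) and S values spread along the continuum would be the natural candidates for a scaled list.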

In order to select a subset of items (tasks)
which would form a Thurstone equal appearing
interval scale, items were sought which
would represent all values along the psychological
"difficulty" continuum, have minimum
Q values, and sample all technical areas performed
in the ratings involved.

Since it was hoped that two parallel scales
could be constructed, two sets of items were
selected. Because of the single items at the
extremes of the S value distribution, it
was necessary to accept three items (7,
Removing; 9, Replacing; and 11, Trouble
shooting/isolating malfunction(s) in) in


Fig. 1. S and Q values for items in the preliminary task list. Vertical
lines connect items selected for Scaled Lists A and B to their
respective base lines.




TABLE 1

TASKS SELECTED FOR SCALED LISTS

Scaled List A

 7. Removing
 9. Replacing
 3. Postflight inspecting*
  . Periodically inspecting
 4. Inflight inspecting*
10. Performing preventative maintenance
28. Instructing others in the inspection of*
21. Using appropriate test equipment for determining
    malfunctions in the
16. Analyzing standard circuitry in
11. Trouble shooting/isolating malfunction(s) in

Scaled List B

 7. Removing
 9. Replacing
17. Employing safety precautions on
14. Following block diagrams for
  . Knowing relationship of equipment to other related
12. Calibrating
  . Employing electronic principles involved in
    maintenance of
11. Trouble shooting/isolating malfunction(s) in

Note. The general directions stated that each item was to
be interpreted as applying to the "equipment which is
encompassed by the rating." The number before each task is its item
number in the preliminary task list.

* This task dropped for Guttman scale.


both sets, thus introducing common elements 
into any scores based on the two separate 
scales. 

The selected tasks for the two scales (Scaled
List A and Scaled List B) are indicated by
the vertical lines in Figure 1 and are listed in
Table 1. Although List B contains only 8
items, they are somewhat more evenly spaced
along the continuum than are the 10 items of
List A.

DEVELOPMENT AND ADMINISTRATION OF THE
INDIVIDUAL EVALUATION FORM


In order to establish 


M 


terms of a specific 
supervised rather 


and (b) they asked whether the man being
rated is checked out as being proficient on the
task (i.e., is he capable of doing the task “on
his own" without direct supervision) rather
than how difficult the task is for the typical
striker.

The response alternatives available to the
rater for each task for each man evaluated
were:


1. Has worked on task and is checked out
2. Has worked on task and is not checked out
3. Has not worked on task


This individual evaluation form was administered
to the same supervisors as the preliminary
task list. Each rater was asked to
evaluate a technician he had supervised; it
was suggested that the technician did not
necessarily have to be the best or the worst man
he had had under him. A total of 18 
nicians were evaluated. 


THE GUTTMAN SCALES 


The method of scalogram analysis proposed
by Green (1956) was employed. This
method, an extension of Guttman's techniques,
places emphasis on a single statistic, the index
of consistency (I), in place of the several
requirements for scalability proposed by Guttman.
I relates the obtained reproducibility
(which Green computes from summary statistics)
to that expected by chance. Green suggests
that I should be .50 or greater if the
set of items is to be considered a scale in the
Guttman sense. Green (1956) writes:

This criterion appears to give roughly comparable
results to the many criteria used heretofore, and will be
helpful to those who desire to create a dichotomy of
scales vs. nonscales (p. 87).

However, Green's selection of a specific value
of I for the break between scales and nonscales
is an arbitrary matter. In general, the
higher the I, the greater the confidence that
can be placed in the scalability of the item set.

One problem was whether to attempt to
scale the items for the technicians in each of
the four ratings separately or to combine the
data from all four job specialties. Establishing
the scalability of an item set over the
four ratings separately would not thereby
establish its scalability over the total group. On
this point, Guttman (1950) writes:


Scales for Measuring Technical Skills 


“[a] universe may not form a scale for the total
population, but still form a scale for subgroups of
that population” (p. 83).

On the other hand, it seemed reasonable to
begin with an analysis of the entire sample
since a finding of scalability at that level
would lead to the conclusion that the item set
also scaled within each rating group. In this
connection, Guttman (1950) states: “if a
scale is obtained for a cross section of the
population then that same scale pattern
necessarily holds for all major subgroups" (p. 83).
Therefore, in the present study, the analysis
began with the response data from all four
ratings taken together, with the thought that
if scalability was not established at that level,
the analysis would then proceed to various
combinations of three or two ratings.

The items or tasks to be tested for Guttman
scalability were those included in Scaled
List A (10 items) and Scaled List B (8
items) which had been selected, as described
above, because they formed Thurstone equal
appearing interval scales. Guttman has not
provided any method for the preliminary
selection and ordering of a set of items. In this
study this was accomplished by using the
results of the Thurstone analysis.

Since Green’s method requires dichotomous
scoring, the “not checked out” and “not
worked on” categories of the evaluation form
were considered equivalent, as opposed to the
“checked out” response.
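
The dichotomous scoring and the two statistics discussed above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the study. The error count compares each technician's responses with the ideal pattern for his total score, which is one common convention and not Green's summary-statistics method. The chance reproducibility is supplied as an input and not computed. The index is written in the form I = (Rep - Rep_chance) / (1 - Rep_chance), which is assumed here to be the form Green intends. The response matrix, the function names, and the chance value of 0.86 are hypothetical.

```python
CHECKED_OUT = 1  # response alternative 1 on the evaluation form


def dichotomize(responses):
    """Score 1 for 'checked out'; alternatives 2 and 3 both score 0."""
    return [1 if r == CHECKED_OUT else 0 for r in responses]


def reproducibility(matrix):
    """Proportion of responses that match the ideal Guttman pattern.

    Each row is one technician; columns run from easiest to hardest item.
    The ideal pattern for a total score k is k ones followed by zeros.
    """
    errors = 0
    for row in matrix:
        k = sum(row)
        ideal = [1] * k + [0] * (len(row) - k)
        errors += sum(a != b for a, b in zip(row, ideal))
    return 1 - errors / (len(matrix) * len(matrix[0]))


def index_of_consistency(rep, rep_chance):
    """Assumed form of the index: (Rep - Rep_chance) / (1 - Rep_chance)."""
    return (rep - rep_chance) / (1 - rep_chance)


# Hypothetical raw responses (1, 2, or 3) for five technicians on four items.
raw = [
    [1, 1, 1, 2],
    [1, 1, 3, 3],
    [1, 2, 1, 3],  # one out-of-order "checked out" response
    [1, 3, 3, 3],
    [1, 1, 1, 1],
]
matrix = [dichotomize(row) for row in raw]
rep = reproducibility(matrix)
print(round(rep, 2))
# 0.86 is a hypothetical chance value chosen only for illustration.
print(round(index_of_consistency(0.94, 0.86), 2))
```

Because the index rescales reproducibility against its chance level, a high reproducibility alone does not guarantee an index above Green's suggested .50.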

When the responses to the 10 items of the
Thurstone Scaled List A were subjected to a
Guttman analysis, a reproducibility figure of
[illegible] and an I of .42 were obtained. Since the I
value did not reach Green's critical level of
.50, the next step was to consider dropping
from the analysis the technicians of one of the
ratings. Review of the data suggested that the
responses of the TRADEVMEN differed from the
responses of the other three ratings more than
those of the three did from one another. But
[illegible] it seemed apparent that the disturbing
influence was in the items rather than in
the [illegible], the next step was to drop some
[illegible] items. There is some disagreement
about the wisdom of this procedure. Green
(1954), for example, writes:


TABLE 2

RESULTS OF GUTTMAN SCALABILITY ANALYSIS

                                       STPCL A       STPCL B

Reproducibility                          .94           .94
Reproducibility expected by chance   [illegible]   [illegible]
Index of consistency                     .57           .57


If a set of items does not scale, the possibility exists 
of rejecting one or two poor items, and then achiev- 
ing a scale. Guttman is chary of this procedure, pre- 
ferring to say that the universe is not scalable. How- 
ever, it seems possible to have perfectly good items 
with the wrong form for the Guttman scale. To this 
author, the possibility of rejecting items seems to be 
a necessary part of any method of (attitude) [parentheses
added] measurement (p. 357).²


Torgerson (1958, p. 330) takes the same 
point of view as Green. 

Tasks 3 (Postflight inspecting), 4 (Inflight
inspecting), and 28 (Instructing others in the
inspection of) were eliminated from Scaled
List A and the remaining seven items analyzed.
The results are shown in Table 2. The
obtained I of .57 is high enough to conclude
that the seven items involved form a scale in
the Guttman sense. These seven items are referred
to as the Scaled Technical Proficiency
Check List, Form A (STPCL A).

The results from the Guttman analysis of
the eight items in the Thurstone Scaled List
B are also presented in Table 2. The I of .57
indicates that these items also constitute a
Guttman scale. These eight items are referred
to as the Scaled Technical Proficiency Check
List, Form B (STPCL B).

The value of I for STPCL A is inflated to
some extent by the fact that, in deciding upon
which items to drop, the response matrix was
examined and, therefore, some advantage was
taken of chance relationships in the data.
Although I for STPCL A should be checked in
another population sample to determine its
value more accurately, the fact that STPCL
B, from which no items were eliminated,
scaled would suggest that the I given above
for List A is not a gross overestimation.


² Quoted by permission from Gardner Lindzey,
Handbook of Social Psychology, 1954, Addison-
Wesley Publishing Company, Reading, Mass.


DISCUSSION


The fact that it was possible in the present 
study to establish scales over four related but 
different naval job specialties has several sig-
nificant implications. It is apparently possible
to generalize a function by divorcing it from
a specific context and still retain its meaning- 
fulness in different situations. This was the 
central problem faced in writing the items or 
task descriptions. That it is possible to scale 
these items means that the job proficiency of 
the technicians in the several ratings involved 
can be evaluated with reference to a hier- 
archy of job tasks and that if they are checked 
out on one task on a scaled list, it can be as- 
sumed that they are proficient on the tasks 
which are ranked below that one on the list.
Application of the technique would seem to 
be of value in understanding the basic struc- 
ture of jobs and the interrelationships among 
them and to have significance for the develop- 
ment of training programs. It would also seem 
to be of value for describing the work per- 
formed by the men in related jobs, the se- 
quence of technical skill development, and for 
job evaluation. 


SUMMARY AND CONCLUSIONS 


Check lists for use in evaluating task per- 
formance in several related naval job special- 
ties (ratings) were shown to meet the Thur- 
stone and Guttman scalability requirements. 
The Scaled Technical Proficiency Check Lists 
evaluate the status of a technician with ref- 
erence to tasks normally performed by men 
of equivalent pay grade and rating. The lists 
contain only a relatively small number of 
items, so that they are simple and convenient
to use. Yet, because the tasks included form
a scale, the score obtained from them can be
generalized in meaning to the "universe" of
tasks of which they are representative.


REFERENCES

Green, B. F. Attitude measurement. In G. Lindzey
(Ed.), Handbook of social psychology. Cambridge,
Mass.: Addison-Wesley, 1954. Pp. 335-369.

Green, B. F. A method of scalogram analysis using
summary statistics. Psychometrika, 1956, 21, 79-88.

Guttman, L. The basis for scalogram analysis. In
Measurement and prediction. Princeton: Princeton
Univer. Press, 1950. Pp. 60-90.

Mosel, J. N., Fine, S. A., & Boling, J. The scalability
of estimated worker requirements. J. appl. Psychol.,
1960, 44, 156-160.

Richlin, M., Siegel, A. I., & Schultz, D. G. Post-training
performance criterion development and
application: Development and application of a
TBCL criterion to the SESR program for aviation
electronics technicians. Wayne, Pa.: Applied
Psychological Services, 1960.

Siegel, A. I., & Benson, S. Post-training performance
criterion development and application: Technical
performance check list criteria which meet
the Thurstone and Guttman scalability requirements.
Wayne, Pa.: Applied Psychological Services, 1959.

Siegel, A. I., Richlin, M., & Federman, P. A comparative
study of “transfer through generalization”
and “transfer through identical elements” in technical
training. J. appl. Psychol., 1960, 44, 27-[illegible].

Siegel, A. I., Schultz, D. G., & Benson, S. Post-training
performance criterion development and application:
A further study into technical performance
check list criteria which meet the Thurstone
and Guttman scalability requirements. Wayne, Pa.:
Applied Psychological Services, 1960.

Thurstone, L. L., & Chave, E. J. The measurement
of attitude. Chicago: Univer. Chicago Press, 1929.

Torgerson, W. S. Theory and methods of scaling.
New York: Wiley, 1958.


(Received November 23, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 3, 143-149 


INTERESTS OF ENGINEERS RELATED TO TURNOVER, 
SELECTION, AND MANAGEMENT 


J. B. BOYD 


Hydro-Electric Power Commission of Ontario, Toronto 


It is usually assumed that the extent to
which a person’s work and work environment
are compatible with his interests helps determine
whether or not he stays on the job. The
Strong Vocational Interest Blank (SVIB) predicts
occupational stability (Strong, 1955).
Does it predict job stability?

If turnover should prove to be predictable,
it might be reduced either by screening of
applicants or alternately by changing conditions
in the organization so as better to meet the
needs of those who would otherwise leave.
Which of these two applications to make
should be decided on the basis of some criterion
of what constitutes a desirable population
for the organization.


METHOD 


The interests expressed through the SVIB by newly
hired engineers who had left the organization after a
short period of service were compared with those
expressed by engineers who remained for a longer
period. The comparison was made on the occupational
scales, then on each of the 400 items, and
finally on certain groups of items which gave promise
of being significant. Differences found in the
comparison were cross-validated. Further steps were
taken to determine whether the men who leave or
those who stay are more like high achievement
engineers. The precise steps are stated in the
reporting of results.


Subjects

Validation Group. From 1953 to 1955, [illegible]
engineers were employed and enrolled in a
rotational training program of 18- to 24-months duration.
The majority were graduates of the year in
which they were hired. As part of a battery in a
long term validity study they completed the SVIB
during their first month. They took the tests
on the understanding that the results would not be
used in decisions regarding themselves but might be
used with future groups for selection and placement
if valid relations were discovered. The median age
was [illegible] with an interquartile range of 23.4 to [illegible].
At the end of March 1956, 30 of these men had
left the service and they will be referred to as VL. The
[illegible] who stayed on after the cutoff date will be
referred to as VS.




Cross-Validation Group (C). A second group con- 
sisted of 70 men hired and enrolled in the training 
program in 1956 and 1957. The median age is 24.6 
with an interquartile range from 22.9 to 26.4. They 
also completed the SVIB in their first month. They 
divide into CL, the 13 who left within their first 2 
years of service, and CS, the 57 who stayed longer. 

High Achievement Group (HA). This consisted of 
99 engineers who had progressed relatively quickly 
to senior status in the organization. They were used 
to determine whether those who leave the organiza- 
tion or those who stay are more alike in interests to 
those who presently fill key posts. In this way they 
formed the basis for judging the appropriateness of 
selection vs. an alternative application of the results. 

The selection of HA was based on salary level and 
age, with various kinds of work assignment repre- 
sented. 

Since the salary evaluation plan for engineers was 
revised on the basis of a complete job study in 1956, 
salary level was considered a good measure of the 
value to the organization of each engineer's work. 
The top one-third of the engineers based on salary 
level included 343 men. 

Age limits were added as a criterion for inclusion 
in the group as a means of controlling the rate of 
attainment and furthermore of reducing the age gap 
between HA and the junior group with which it was 
to be compared. Consistent with the other criteria, 
the group was chosen to be as young as possible. 
The upper age limit was 40 for the head office tech- 
nical-administrative subdivision and 50 for the other 
two described in the next paragraph, since promotion 
is not so rapid for them. More than 90% of the 
total were 45 or under. 

Engineers found in different kinds of assignment 
might be expected to show differences in interest pat- 
terns. In view of this the main divergencies should 
be represented in sufficient numbers to test for dif- 
ferences in interest. Two factors were considered im- 
portant in this connection: namely, amount of tech- 
nical content in the work and location. On this basis 
three subdivisions were recognized: technical-administrative,
field; technical-administrative, head office;
and technical, head office. It was considered desirable
to have at least 30 in each subdivision and a total
of about 100.

These specifications resulted in a list of 106 engineers.
They were approached personally with an explanation
of the research, a general indication as to
how they had been selected, an explanation of the
biasing effect which might occur if self-selection were
superimposed, and a request for cooperation. There
were 99 who complied by completing the SVIB on
a self-administered basis during January and February
of 1959. In view of this response the biasing
due to the voluntary nature of the proposition was
considered negligible.


The median age of HA is 36.5 years with an inter- 
quartile range from 34.2 to 41.0. The technical-ad- 
ministrative, field subdivision was represented by 27 
men; the technical-administrative, head office by 41;
and the technical, head office by 31. 


RESULTS 
Comparison on Occupational Scales 


The SVIB was scored on Strong’s 45 occu- 
pational scales and on Specialization, Occupa- 
tional Level, and Masculinity-Femininity. A 
comparison of mean scores of VS and VL 
yielded 18 differences at p < .05, 11 of them 
being at p < .01. 

The combined predictive value of the interest
scales was obtained by calculating the
discriminant function. It identified correctly
72% of CL but incorrectly placed 24% of CS
in CL.

Since Strong’s scales were designed primarily
to predict stability in various occupations,
it seemed likely that some other
treatment of the 400 items might prove more
predictive of stability within a particular
organization.

Item Analysis 


On the basis of item analysis six groups of
items were developed each of which differentiated
between VS and VL at p < .01.¹ The
item groups were called “Mechanical-Technical,”
“Competitive-Persuasive,” “Management,”
“Repetitive Detail,” “Literary and Artistic,”
and “Attitude to Irritating Characteristics
in Others.” These six groups were used
in cross-validation.²


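¹ Tested by the ratio of the weighted mean difference
to its standard error (Cochrane, 1954, pp. 445-446).
These calculations and the composition of the
“Management” item group are the work of Ben
Kurtz, Professional Engineer of the Hydro-Electric
Power Commission of Ontario.

² A fuller account [illegible] which proved significant
[illegible] analysis, and the development [illegible]
has been deposited with the American Documentation
Institute. Order Document No. 6666 from ADI
Auxiliary Publications Project, Photoduplication
Service, Library of Congress; Washington 25, D. C.,
remitting in advance [illegible] for microfilm or $1.25 for
photocopies. Make checks payable to: Chief, Photo-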

TABLE 1
DISTRIBUTION OF DISCRIMINANT FUNCTION VALUES
BASED ON FOUR ITEM GROUPS
(Cross-validation data)

Discriminant Function     Number    Number
Value X 1000              Stayed    Left

+90 and higher              20         5
-10 to +89                  28         6
-110 to -11                 17         5
-111 and lower               3         7

Totals                      68        23


Cross-Validation

When the six scores were applied to the
cross-validation group two of them failed to
distinguish between CS and CL. The discriminant
function of the remaining four was calculated
and the distribution of these values is
shown in Table 1. The rate of turnover is 7
out of 10 at the lowest discriminant function
range whereas in the other three ranges the
ratio is in the neighborhood of 1 in 4. In view
of the differentiation at the lower end it
was surprising that the high end of the distribution
does not discriminate better. It was
noted that of the five in the top group who
left, four were civil engineers in the 1956
class. Moreover, civil engineers account for 10
out of the 16 of the 1956 class who left in
the first 2 years. The situation is shown in
Table 2.

The turnover rate for civils is over three
times as great as for others. Among the latter

TABLE 2
COMPARISON OF CIVIL AND OTHER ENGINEERS
AS TO RATE OF TURNOVER
(1956 class)

                    Totals   Stayed   Left   Turnover
                     No.      No.     No.       %

Civil Engineers       21       11      10       48
All Others            35       29       6    [illegible]

Totals                56       40      16    [illegible]


duplication Service, Library of Congress. A table of
item weights for the four item groups, which
successfully cross-validated, is also provided.


Turnover, Selection, and Management 


TABLE 3
DISTRIBUTION OF DISCRIMINANT FUNCTION VALUES
EXCLUDING CIVIL ENGINEERS HIRED IN 1956
(Cross-validation data)

Discriminant Function    Stayed   Left   Turnover
Value X 1000              No.     No.       %

+90 and higher             20       1        5
-10 to +89                 20       5       20
-110 to -11                16       3       16
-111 and lower              1       4       80

Totals                     57      13


the rate is in the range for succeeding classes
[illegible] years. It is obvious that some special
influence has intervened. It is a matter
of record that civil engineers in the 1956 class
had great difficulty in getting placed within
the organization. A reduction in the level of
[illegible] activities meant that fewer were
needed than had been anticipated. This became
apparent soon after they were hired and
was known to them. It is therefore reasonable
to suppose that a number of these men left
because of the restriction of opportunity although
their interests were compatible with
the work and climate of the organization.
The truer comparison is probably obtained if
the 1956 civil engineers are omitted as is
shown in Table 3. The difference between the
two groups by the Mann-Whitney U test
(one tailed) yields p < .01. Thus it is concluded
that the four item groups together
identify real differences between those who
left and those who stayed.³
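
The turnover rates and the one-tailed Mann-Whitney comparison for Table 3 can be approximated from the banded counts alone. The sketch below is an illustration under stated assumptions: it works from the four score bands (the study presumably used the ungrouped scores, so its exact p value may differ), it uses the normal approximation with a tie correction, and two of the "left" counts (5 and 3) are taken as inferred from the column total of 13. The function names are hypothetical.

```python
from math import erfc, sqrt

# Bands ordered from lowest to highest discriminant score.
# Counts as read from Table 3; two "left" cells are inferred from the totals.
stayed = [1, 16, 20, 20]
left = [4, 3, 5, 1]


def turnover_rates(stayed, left):
    """Percentage who left within each band."""
    return [100 * l / (s + l) for s, l in zip(stayed, left)]


def mann_whitney_one_tailed(low_group, high_group):
    """U and one-tailed p that low_group tends to score below high_group.

    Both arguments are counts per ordered band (lowest band first).
    Uses the normal approximation with a tie correction.
    """
    n1, n2 = sum(low_group), sum(high_group)
    n = n1 + n2
    u = 0.0
    for i, count in enumerate(low_group):
        above = sum(high_group[i + 1:])
        # Pairs in the same band count as half (ties).
        u += count * (above + 0.5 * high_group[i])
    ties = sum(t ** 3 - t for t in (a + b for a, b in zip(low_group, high_group)))
    variance = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / sqrt(variance)
    return u, 0.5 * erfc(z / sqrt(2))


rates = turnover_rates(stayed, left)
u, p = mann_whitney_one_tailed(left, stayed)
print([round(r, 1) for r in rates])
print(u, round(p, 4))
```

Even on the coarse four-band grouping, the approximation falls below the .01 level, which is in line with the result reported above.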
Some description of the four item groups is
appropriate here. The 13 items of the Literary
and Artistic group include expressive activities,
such as “author” and “artist,” as well as
items which denote either the expressive or
appreciative aspects such as “art,” “literature,”
“music.” There is a dearth of items in
the SVIB covering the purely appreciative aspect,
understandable enough in a vocational
instrument. The only one included in this
group is the choice between reading a book
and going to a movie.

³ The same trend is apparent in subsequent classes
though the 2-year period is incomplete. For the 1958
class (after 18 months) those with scores predictive
of leaving have a turnover rate of 50% as compared
to [illegible] for the rest of the group. The figures for
1959 (after 7 months) are 36% and 12%.

There are 12 items in the Repetitive De- 
tail group. They include clerical detail, pre- 
cision manual work, and similarity of work 
and conditions over a period of time. 

The Mechanical-Technical group is made 
up of 20 items which suggest working with 
tools, applying technical knowledge, solving 
mechanical problems, or invention. 

The Competitive-Persuasive group is the 
longest (42 items) and is somewhat more 
varied. It includes not only a liking for per- 
suasive activities and for being a contestant, 
but also for taking chances and for being so- 
cially prominent. Six of the items included 
with this group also suggest an interest in 
management. Because of special concern over 
this area these six items were included in a 
separate item group along with others denot- 
ing managing and directing activities. This is 
the only case of overlap in the 10 item groups. 
The Management item group, as has already 
been shown, did not differentiate between 
those who stayed and those who left. This 
indicates that it is not the management ele- 
ment in the Competitive-Persuasive group 
that accounts for its discriminating power. 

Data showing the performance of each of 
the four item groups separately will be pre- 
sented in the next section when a comparison 
is made with HA. 


High Achievement Engineers 


To determine whether those who left or
those who stayed were more like engineers
who have achieved higher levels in the
organization, the proportions of HA and of the
cross-validation group (C) falling at various
score levels are compared in Tables 4 through
7.⁴

TABLE 4
DISLIKE OF LITERARY AND ARTISTIC PURSUITS
(High achievement engineers compared
with junior engineers)

                    Proportionate Distribution
                                                          Turnover
Score            HA            C          CS      CL      Rate %

+15 to +24   [illegible]  [illegible]    .18     .00        0.0
0 to +14         .50          .48        .54     .23        9.1
-30 to -1        .43          .37        .28     .77       38.5
Total           1.00         1.00       1.00    1.00
Number            99           70         57      13

Note. Differences: CS vs. CL, p < .01; HA vs. CS, p < .05.


TABLE 5
LIKING FOR REPETITIVE DETAIL
(High achievement engineers compared
with junior engineers)

                Proportionate Distribution
                                              Turnover
Score            HA      C      CS     CL     Rate %

+20 to +49      .08    .13    .14    .08       11.1
-20 to +19      .63    .65    .67    .61       19.0
-60 to -21      .29    .22    .19    .31       26.7
Total          1.00   1.00   1.00   1.00
Number           99     70     57     13

Note. Differences: CS vs. CL, p < .10; HA vs. CS, p < .20.


TABLE 6
LIKING FOR MECHANICAL AND TECHNICAL ACTIVITIES
(High achievement engineers compared
with junior engineers)

                          Proportionate Distribution
                                                                      Turnover
Score            HA            C            CS            CL          Rate %

+40 to +69       .37          .27       [illegible]   [illegible]   [illegible]
+10 to +39       .48          .48       [illegible]   [illegible]   [illegible]
-40 to +9    [illegible]  [illegible]   [illegible]   [illegible]   [illegible]
Total           1.00         1.00          1.00          1.00
Number            99           70            57            13

Note. Differences: CS vs. CL, p < .25; HA vs. CS, p < .30;
HA vs. CL, p < .20.


TABLE 7
DISLIKE OF COMPETITIVE-PERSUASIVE ACTIVITIES
(High achievement engineers compared
with junior engineers)

                    Proportionate Distribution
                                                      Turnover
Score            HA      C      CS        CL          Rate %

+40 to +79      .06    .23    .24    [illegible]       12.5
0 to +39        .46    .43    .44        .36           16.7
-50 to -1       .42    .25    .27        .24        [illegible]
-110 to -51     .06    .09    .05        .24        [illegible]
Total          1.00   1.00   1.00       1.00
Number           99     70     57         13

Note. Difference: CS vs. CL, p < .10.


⁴ In these tables, which summarize a fuller presentation
deposited with ADI (Document No. 6666 as
in Footnote 2), the data are grouped into unequal
intervals to highlight the trends. This does not affect
the statistical tests.

⁵ In this, and in succeeding cases where probability
values are given, they are based on the Mann-Whitney
U test. Values between CS and CL, having been
predicted by the validation data, are one-tailed. All
others are two-tailed.


Table 4 shows that a larger proportion of
HA have literary and artistic interests (low
score) than is the case among C, i.e., among
the group as originally hired. The effect of
turnover is to increase the gap. The situation
with repetitive detail as shown in Table 5 is
similar in that HA has a higher frequency in
the low scores, which in this case indicate a
dislike of such activities, and turnover increases
the difference. When the scores from
these two item groups are combined there is
a difference in distribution of CS and CL
which yields p < .001 and between HA and
CS yielding p < .05.⁵

To attempt to reduce turnover by screening
out applicants having these interests does not
appear to be a good solution. For, while the
presence of these interests more frequently in
the HA group does not prove them to be a
critical requirement, nevertheless it does produce
a presumption in the face of which it
would be risky to deprive the organization of
such men. Therefore the indicated
action is to continue to hire these men
and to endeavor to retain them through changing
conditions so as to give more outlet for
these interests.

Tables 6 and 7 point to an opposite conclusion
though not as definitely. Table 6 shows
HA more frequently having an interest in mechanical
and technical pursuits. The pattern
of turnover suggests a curvilinear relationship,
but since this does not occur in the validation
group, the high rate at the upper score level
must tentatively be regarded as fortuitous. In
spite of this irregularity the effect of turnover
on the whole is to decrease slightly the difference
between HA and CS.

Table 7 shows HA highly concentrated


Turnover, Selection, and Management 


the central portion of the range, indicating a 
moderately positive or moderately negative 
ah titude toward C ompetitive-Persuasive ac- 
tivities. By contrast C has higher frequency 
at high scores, indicating a dislike, and a 
lower frequency in the slightly negative scores 
Which indicate a mild liking for these activi- 
ties. While it might be desirable to attempt to 
teadjust this balance in future hiring, there 
may also be a case for continuing to include 
& fairly high proportion of those disliking 
ta petitive Parse activities in view of 
lue low turnover. The most obvious con- 
i ia from the table, however, is that there 
Ere justification for hiring in the high 
a "s group, with scores below —50, since, 
Ser ddition to being poor risks for continued 
* rm they do not often appear at the high 
Se tetas e level. This rules these interests 
ms as a critical requirement for high achieve- 
€nt of an engineer in this organization. — 
Though the differences pointed out n 
E 6 and 7 are separately nonsignificant, 
en the scores are combined and the Mann- 
PAifmey U test applied, the difference be- 
(m CS and CL yields p < ‚03. 
ang o Purposes of distinguishing between HA 
Eu other groups Tables 6 and 7 do not 
Mies profitably because of the form of the 
Se ibution, This means that it cannot be as- 
"ted with confidence that turnover decreases 
i © difference between HA and C. However, 
S reasonably certain that it does not in- 
Et b Thus the reason for objecting to 
eed seen in Tables 4 and 5 does not 
os here, and the interests represented in 
in €s 6 and 7 might be combined in screen- 
8 new applicants with some confidence. 

In presenting these results the HA group has been treated as a whole.
Comparisons were made between the three subgroups based on location and
degree of technical content, but no significant differences in interests
were found.

The specific findings apply, of course, only to the organization in which
the data were obtained. The method and instruments, however, might be
applied elsewhere.




DISCUSSION 
The scales of the SVIB which were found to differentiate in the
validation group may be related to factors derived by Strong (1943,
p. 419). The Musician, Mathematician, Physicist, Dentist, Engineer
scales, associated in
this study with longer tenure, were all found 
by him to have positive loadings on his Factor I,
which he was inclined to call “Science,”
following Thurstone. The remaining positive 
differences which were found in this study— 
namely, Printer, Math-Science Teacher, Car- 
penter, Aviator, Policeman, Farmer—corre- 
spond to scales which have positive loadings 
on Strong's Factor III. The scales which in 
this study were found negative, in the sense 
of being associated with short tenure, were 
Lawyer, Advertiser, Real Estate Salesman, 
Sales Manager, Life Insurance Salesman,
President, Occupational Level. With the ex- 
ception of the last two these all have negative 
loadings on Strong's Factor III. This is the 
factor which Strong wondered whether to call 
“Things vs. People” or “Language.” It ap- 
pears then that interest in science and in the 
concrete as compared with interest in people 
or the expression of ideas is associated with 
longer tenure by engineers in this particular 
public utility. 

The item groups in this study can also be 
compared to the results of factor analysis. 
The Mechanical-Technical item group is prob- 
ably loaded with Strong’s concrete factor. 
More recently Cottle (1950) isolated a fac- 
tor he called “detail interest” which appears 
to be close to the repetitive detail of this 
study. However, these studies all used item 
groups which could not be expected to be 
homogeneous, Thurstone and Strong using oc- 
cupational scales and Cottle using even groups 
of occupations. Furthermore Cottle omitted 
the whole of Strong’s Group IV which includes 
all the mechanical occupations of the SVIB. 

A more systematic exploration of interests 
was made by Guilford, Christensen, Bond, and 
Sutton (1954), who constructed a total of 100 
relatively homogeneous 10-item interest ques- 
tionnaires. From among their 28 factors some 
parallels can be drawn with identifiable ele- 
ments in the item groups of this study. These 
parallels are drawn purely on the basis of de- 
scriptive similarity and without seeing the 
actual items used by Guilford et al. They are 
shown in Table 8. From this comparison it 
appears that the Mechanical-Technical item 
group is the most unitary and the others quite 




TABLE 8
DESCRIPTIVE SIMILARITY OF ITEM GROUPS AND INTEREST FACTORS
OF GUILFORD, CHRISTENSEN, BOND, AND SUTTON

Item Groups                 Interest Factors

Literary and Artistic
  appreciation              esthetic appreciation
  expression                esthetic expression
                            cultural interest
Repetitive Detail
  clerical                  clerical
  manual precision          precision
  unchanging pattern        variety
Mechanical-Technical        mechanical
Competitive-Persuasive
  competitiveness           cultural conformity (includes competition, status)
                            social initiative (includes competition)
  persuasiveness            business interest
                            ambition (includes persuasion)
  chance-taking             adventure vs. security
                            expressiveness vs. restraint (includes risk taking)
  social prominence         need for attention (includes recognition, status,
                              exhibition)

mixed factorially. To obtain factorially pure 
measures involves much more than a rear- 
rangement of the SVIB items, or even the ad- 
dition of items to round out its coverage of 
interests. The factor analysis of interests sug- 
gests that most items themselves contain con- 
siderable mixture of factors. 

The comparison made in this study be- 
tween junior and senior engineers differing in 
median age by 12 years is considered justified in view of the evidence
for the stability of interests after college graduation (Strong, 1955).
A word should be said about probability levels used in this study. In
some comparisons limits of .03 and .05 have been the basis for
conclusions. The justification for this is that in a practical situation,
as Burd (1959, p. 49) has said, the consequence of Type II error may have
to be taken into account. Where a management decision must be made,
implicitly or explicitly, between two alternate courses of action is
surely such a case.

J. B. Boyd

It would be desirable to have a second group of senior engineers to
verify the conclusions based on comparison with HA. However, in the
particular organization such a group will not appear again for
[illegible] years or so, since all who fitted the specifications of age
and level of attainment were included. They constitute then all of the
most recently arisen formal leaders amongst engineers and in this
immediate sense they are a universe rather than a sample. They are, of
course, a temporal cross section of a universe made up of successive
leadership groups.

Of more serious concern than [illegible] is the assumption that the
present leadership group provides the best model for the future. A good
case can be made for the [illegible] for the past of the leadership of
the [illegible] organization. In order to choose a better criterion for
the future it would be necessary to foresee future developments and to
understand what new demands these would place upon the leadership. Thus
the present study, if uncritically used, might result in the perpetuation
of a current pattern inappropriate for the future. However, the greater
the knowledge of current leadership characteristics the more possible it
is to vary the pattern deliberately in the direction of foreseeable
needs.

A great deal of evidence has been cited (Argyris, 1957) to show that
organizations are not necessarily adaptive, either in relation to their
overall objectives, or in relation to the needs of their individual
members. Studies such as the present one, particularly if they explore a
broad range of characteristics and determine the extent to which they are
critical, could be used to guide the management of an organization in
creating conditions which will encourage the [illegible] and development
of people whose aspirations are likely to be fulfilled in advancing the
organization’s purposes.


SUMMARY 


Using the Strong Vocational Interest Blank, four groups of interest items
were found to distinguish engineers who left the service of a particular
electrical utility within the first [illegible] from engineers who stayed
longer. [Illegible] the interests of engineers who had [illegible] senior
responsibility in relatively short time confirmed the suspicion that some
of those who were leaving might be those most similar in interest to the
leaders. Thus those likely to leave the organization could be separated
into two identifiable groups. One group it is considered relatively safe
to screen out at the time of application. The other group should be
encouraged to stay [illegible] to change conditions in the organization
so as to provide better satisfaction for their [illegible]. It is
suggested that, as well as selecting suitable people, an organization may
need to adapt itself so as to satisfy the needs of the kind of people it
requires.




REFERENCES 


Argyris, C. Personality and organization. New York: Harper, 1957.

Burd, F. W. Clinical psychology and the true psychologist. Ontario
Psychol. Ass. Quart., 1959, 12, 47-58.

Cochran, W. G. Some methods for strengthening the common chi square
tests. Biometrics, 1954, 10, 417-451.

Cottle, W. C. A factorial study of the Multiphasic, Strong, Kuder, and
Bell inventories using a population of adult males. Psychometrika, 1950,
15, 25-47.

Guilford, J. P., Christensen, P. R., Bond, N. A., Jr., & Sutton, M. A.
A factor analysis study of human interests. Psychol. Monogr., 1954,
68(4, Whole No. 375).

Strong, E. K., Jr. Vocational interests of men and women. Stanford:
Stanford Univer. Press, 1943.

Strong, E. K., Jr. Vocational interests 18 years after college.
Minneapolis: Univer. Minnesota Press, 1955.

(Received March 30, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 3, 150-155


A COMPARISON OF INDIVIDUALS VERSUS GROUPS IN JUDGING PERSONALITY¹


VICTOR B. CLINE AND JAMES M. RICHARDS, JR.


University of Utah 


As a practical necessity men are continually 
required to subjectively judge, assess, and 
evaluate their associates. Frequently in the 
military or in industry this is a prerequisite 
in initial employment, promotion, etc. There 
have been various approaches to the quantifi- 
cation of subjective judgments of which per- 
haps the most common have been rating pro- 
cedures. Since this type of judgment and the 
decisions or courses of action which result 
therefrom have so many far reaching implica- 
tions, any research which might further con- 
tribute to our knowledge in this area should 
be of considerable intrinsic importance. The 
purpose of the present study was to (a) de- 
termine whether individuals or groups are 
more likely to be accurate in making social 
judgments (i.e., “predictions” of the behavior 
and personality of other individuals), and (b)
at the same time compare different types of 
group judgment. These judgments were made 
on instruments similar to two different kinds 
of rating scales commonly used in applied 
settings. 

The rationale of this experiment grew out 
of the recent survey of studies comparing 
group performance and individual perform- 
ance made by Lorge, Fox, Davitz, and Bren- 
ner (1958). The general conclusion of this 
survey was that a group, on almost any task, will perform better than a
typical individual, but not necessarily better than a superior individual
on the task in question. This finding is true whether the “group
performance” is made by a genuine group or is merely a statistical
combination of several independent individual performances. An unresolved
question is the degree to which these findings can
be attributed to a reduction in the variability 
of the group performance. 

¹ This research was supported under Contract Nonr 1288(04), Group
Psychology Branch, Office of Naval Research.

The trend of the studies cited in this survey suggested the hypothesis to
be tested in this experiment. This hypothesis is:

The accuracy of predictions (about the behavior of other persons) made by
a group of persons arriving at a consensus prediction through group
discussion will be significantly greater than the average accuracy of the
predictions made by the individuals composing the group. The average
accuracy of the predictions made by the individuals composing the group
will also be significantly less than the accuracy of an “artificial
group” (composed of pooled independent judgments for each item) and also
less than the accuracy of prediction of the best individual among the
individuals composing the group.

A secondary question relates to the presence or absence of a consistent
pattern of superiority in accuracy among predictions made by best
individual judges, consensus groups, and artificial groups composed of
pooled independent judgments for each item.

METHOD 
Measures 


The subjects were 186 students, both male and female, in the introductory
psychology classes at the University of Utah in the fall of 1959. The
procedure involved the presentation of six filmed interviews of “standard
others.” These were photographed in sound and color, and were conducted
by an [illegible] member of the university theatre staff, who asked a
fairly standard series of questions (to insure equivalence over
interviews) probing the following areas: personal values, personality
strengths and weaknesses, reaction to the interview, hobbies and
activities, self-conception, and temper.

After a filmed interview had been shown the projector would be stopped
and the subject-judges required to fill out paper-pencil judging
instruments. Following this another interview would be shown, and so
forth. Details of the development and selection of these films, the
experimental procedures involved, and certain underlying methodological
and theoretical considerations have been published elsewhere (Cline &
Richards, 1958, 1960a).

In this study, two prediction instruments were used. The first of these
was the Adjective Check




Individual versus Group Personality Judgments 


List (ACL), which required the subject to determine which of a pair of
adjectives the interviewee had checked as being descriptive of himself.
A sample item is:

14. ____ (a) resourceful
    ____ (b) cheerful


There were 20 such pairs for each of the six films for a total of 120.
The score on the ACL was the number correct. Thus the ACL is similar to a
forced-choice rating procedure.

The second instrument used was the Belief-Values Inventory (BVI). On this
instrument the subject was required to determine (predict) how the
interviewee responded to a Likert-type scale dealing with religious
beliefs. During the course of the interview, the person in the film had
been asked direct questions in this area. A sample item is:


I feel quite sure God does not exist.

____ (1) Strongly agree
____ (2) Agree
____ (3) Neither agree nor disagree
____ (4) Disagree
____ (5) Strongly disagree

Thus the BVI is comparable to a graphic rating procedure.


There were 12 such items for each film or interview. Five different
scores based on a recent modification by Cline and Richards (1960b) of an
analytic procedure suggested by Cronbach (1955) were computed from
judges’ responses to this instrument using a program developed for the
IBM 650 computer. The first of these was a total score, which was based
on the average of the squared differences (using the one- to five-point
scale) between predictions made by each judge for each interviewee, and
actual responses of each interviewee. This is an error score, so in order
to make these scores comparable to other scores used in this study, the
scores were converted to accuracy scores through a standard score
transformation, setting the mean equal to 50 and standard deviation equal
to 10.
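
The total score described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the responses are invented, the sign is reversed so that a smaller error gives a larger accuracy score (the text says only that error scores were converted to accuracy scores), and the population standard deviation is used because the text does not say which form the original IBM 650 program used.

```python
from statistics import mean, pstdev


def error_score(predicted, actual):
    """Mean squared difference between predicted and actual 1-5 responses."""
    return mean((p - a) ** 2 for p, a in zip(predicted, actual))


def accuracy_t_scores(errors):
    """Rescale error scores to mean 50 and SD 10, reversed so high = accurate."""
    m, s = mean(errors), pstdev(errors)
    return [50 - 10 * (e - m) / s for e in errors]


# Hypothetical data: one interviewee's actual answers and three judges' guesses.
actual = [2, 4, 5, 1]
judges = [[2, 4, 4, 1], [3, 3, 5, 3], [5, 1, 2, 4]]
errors = [error_score(j, actual) for j in judges]
print(errors)
print(accuracy_t_scores(errors))
```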
The next two BVI scores are components of what Cronbach (1955) has called
“Stereotype Accuracy.” This measures the degree to which each judge’s
predicted item means (averaged across interviewees) correspond to actual
item means. The two scores used in this study are (a) the correlation
between a judge’s predicted item means and the actual item means,
converted to Fisher’s z, and (b) the variance of the judge’s predicted
item means. Cronbach has shown these to be the two parameters of
Stereotype Accuracy when the criterion is constant, and they permit
independent evaluation of the effect of grouping on accuracy and on
variability of prediction in this study.

The final two scores are components of Interpersonal Accuracy. This
represents the degree to which judges accurately predict the responses of
interviewees to individual items, and involves mainly the degree to which
judges correctly order the interviewees in terms of their overall degree
of “religiosity.” It, therefore, is the best measure of the kind of
accuracy that is the main concern in most institutional rating
situations. Interpersonal Accuracy, like Stereotype Accuracy, has two
independent parameters, a correlation term expressed in terms of Fisher’s
z, and a variance term, thus permitting independent evaluation of
accuracy and variability. The correlation score is computed by
determining the correlation between each judge’s predicted values and the
corresponding actual values on individual items, converting to Fisher’s z
and averaging across items. The variance score is computed by determining
the variance of each judge’s predicted scores on individual items and
averaging across items.
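
The two Interpersonal Accuracy terms can be sketched as follows. The data layout (one row per item, one column per interviewee), the function names, and the use of the population variance are assumptions made for illustration; the responses are invented. Fisher's z is the inverse hyperbolic tangent of r, so a correlation of exactly 1 or -1, or an item on which a judge gives every interviewee the same value, cannot be scored by this sketch.

```python
from math import atanh
from statistics import mean, pvariance


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def interpersonal_accuracy(predicted, actual):
    """predicted[i][k], actual[i][k]: value on item i for interviewee k.

    Returns (mean Fisher z across items, mean prediction variance across items).
    """
    z_scores = [atanh(pearson_r(p, a)) for p, a in zip(predicted, actual)]
    variances = [pvariance(p) for p in predicted]
    return mean(z_scores), mean(variances)


# Hypothetical judge: two items, three interviewees.
predicted = [[1, 2, 4], [5, 3, 2]]
actual = [[1, 3, 4], [4, 3, 1]]
print(interpersonal_accuracy(predicted, actual))
```

Keeping the correlation and variance terms separate is what allows the effect of grouping on accuracy to be examined apart from its effect on the spread of the predictions.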


Procedure 


The 186 subjects in this experiment were divided 
into 62 three-person groups. The division was made 
at the time the experiment was conducted, and most 
groups consisted of three persons seated next to each 
other in the experimental room. Group composition
in terms of sex of group members was roughly ran- 
dom. The subjects saw each film and first completed 
the judging instruments independently. They then 
joined together in group discussion fashion and pro- 
ceeded to arrive at a consensus judgment for the 
items on the judging instruments without referring 
back to, or looking at, their earlier independent judg- 
ments. 

The “artificial group” judgment (or, in other words, 
pooled independent judgments for each item) was 
derived from the individual judgments of the group 
members. Thus, on the ACL, the artificial group 
judgment was determined on the basis of a “majority vote” of the judges
on each item (by inspecting their individual judging protocols). On the
BVI, it was calculated by determining the average of the values predicted
by the three judges for each interviewee on each item. It is important to
emphasize that this artificial group is not a group in the psychological
sense, but only a statistical combination of the original independent
judgments for each item.

The "average accuracy of individuals composing 
the group" was, of course, obtained by computing 
the mean of the total accuracy scores of the three 
individuals who made up each group. It is most important
to note that this is not the same thing as the
artificial group procedure where it was the actual 
item by item predictions of the three group mem- 
bers that were averaged rather than their total ac- 
curacy scores. 
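
The distinction between the two procedures can be made concrete with a short sketch. The predictions and responses are invented, the function names are hypothetical, and accuracy is represented by the mean squared error described for the BVI total score rather than by the authors' standard-score form.

```python
from collections import Counter
from statistics import mean


def majority_vote(choices):
    """Pooled ACL judgment: the option chosen by most of the three judges."""
    return Counter(choices).most_common(1)[0][0]


def pooled_bvi(predictions):
    """Pooled BVI judgment: mean of the judges' predicted values, item by item."""
    return [mean(item) for item in zip(*predictions)]


def error(predicted, actual):
    """Mean squared difference between predicted and actual responses."""
    return mean((p - a) ** 2 for p, a in zip(predicted, actual))


judges = [[1, 4, 2], [3, 4, 5], [2, 1, 5]]   # hypothetical predictions
actual = [2, 3, 4]                            # hypothetical actual responses

# Artificial group: pool the predictions first, then score the pooled set.
pooled_error = error(pooled_bvi(judges), actual)
# Average of individuals: score each judge first, then average the scores.
average_error = mean(error(j, actual) for j in judges)

print(majority_vote(["a", "b", "a"]))
print(pooled_error, average_error)
```

With a squared-error score the pooled prediction can never have a larger error than the average of the individual errors, because the square of a mean deviation cannot exceed the mean of the squared deviations. In this invented case the three judges' errors cancel exactly on every item.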

The “best judge” in each group was selected on the basis of his accuracy
scores. In interpreting the results of this study, therefore, it is
important to note that this selection was done on an after-the-fact
basis, thus maximizing accuracy scores for this condition by capitalizing
on chance. It would, therefore, be impossible for a best judge selected
in advance to obtain a higher score than this, and such a best judge



TABLE 1
MEANS AND STANDARD DEVIATIONS OF JUDGMENT SCORES

Judgment Score | Average of Individuals Composing the Group | Best Judge | Three-Person Group Consensus | Artificial Group Derived by Pooling Three Independent Judgments
wae Check List: oor 101.66 102.52 us 
z [ 3.95 dd 
D 3.51 3.91 
Belief-Values Inventory : 
23 43.29 53.92 49.47 E 
1 11.16 à 
"i 8.61 831 
Stereotype Accuracy z T 144 128 14r 
28 37 ES d 
ë 5 
Stereotype Accuracy Variance 3s ái 31 = 
e 44 22 15 ` 
Interpersonal Accuracy z 98 
Žž .90 1.01 1.00 H 5 
" 12 4 A6 Bis 
Interpersonal Accuracy Variance 91 
€ 1.09 1.06 1.06 js 
c .23 28 E Ai —— 
would, in fact, probably score somewhat lower, since some error would be
involved in any advance selection. The best judges were selected
independently for the ACL and the BVI and therefore were not necessarily
the same person on the two different instruments. On the BVI, however,
the best judges, selected on the basis of total score, were also used as
best judges in making the comparisons involving the other scores derived
from this instrument.


RESULTS


The means and standard deviations for each judgment procedure on each
score are presented in Table 1. In Table 1, all scores are accuracy
scores. Since total score on the BVI is based on error score, in Table 1
this judgment score is transformed to a standard score

TABLE 2
RESULTS OF OVERALL F TESTS FOR JUDGMENT SCORES

                                    Between-      Within-
                                    Variance      Variance
Judgment Score                      (df = 3)      (df = 244)      F

Adjective Check List:
  Total                              451.82        13.82         32.62**
Belief-Values Inventory:
  Total                             1423.02        84.33         16.81**
  Stereotype Accuracy z               .8633         .1432         6.03**
  Stereotype Accuracy Variance        .1333         .0282         4.72**
  Interpersonal Accuracy z            .1633         .0213         7.67**
  Interpersonal Accuracy Variance     .3900         .0754         5.17**


TABLE 3
TESTS FOR SIGNIFICANCE OF DIFFERENCE BETWEEN INDIVIDUAL MEANS FOR EACH JUDGMENT SCORE

Judgment Score | Average of Individual vs. Best Judge | Average of Individual vs. Group Consensus | Average of Individual vs. Artificial Group | Best Judge vs. Group Consensus | Best Judge vs. Artificial Group | Group Consensus vs. Artificial Group
Adjectiv nee Miu 
€ Check Li 
Tota] "n azor — 525 — 605 —— 86 1.66* 80 
Heliet.y. > 
-Values Iny - ] 
Total VENE 10.63** — G18** — 98 — 445^ 1.05 3.40* 
Stereotype Accuracy 3 25** 09 22" .16* 03 -13* 
x d . E: n & ++ je 
tereotype Accuracy Variance 05 O4 05 a p s 
Merpersone] Accuracy z A1** .10** Ja p^ : i [p^ 
Merpersona] Accuracy Variance 03 .03 A8 Á 15 AS 
Note. Values in this table represent the absolute difference between groups without regard to the direction of the difference.


distribution with mean = 50 and standard deviation = 10.

As a first step in the statistical analysis of these data, overall F
tests were calculated for each of the judgment scores separately. Results
of this analysis are presented in Table 2. No test for homogeneity of
variance was made before calculation of these F tests. This procedure was
followed because the recent work of Boneau (1960) strongly suggests that
the F test is not significantly affected by heterogeneity of variance if
the sample sizes are identical and relatively large, i.e., 20. Both of
these conditions hold in the present study. It is known that available
tests for homogeneity of variance are affected too much by other
variables than that involved in the null hypothesis to justify their use
prior to an analysis of variance (Box, 1953).
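
As a rough check on the overall tests, each F is the ratio of the between-procedure variance to the within-procedure variance. The sketch below treats the analysis as a simple one-way design with four judgment procedures and 62 groups, which is an assumption consistent with the degrees of freedom printed in Table 2; the function names are hypothetical, and the variance estimates are copied from Table 2.

```python
def degrees_of_freedom(n_procedures, n_groups):
    """Between- and within-procedure df for a one-way analysis of variance."""
    between = n_procedures - 1
    within = n_procedures * n_groups - n_procedures
    return between, within


def f_ratio(between_variance, within_variance):
    """Overall F: between-procedure variance over within-procedure variance."""
    return between_variance / within_variance


# Four judgment procedures, each scored for 62 three-person groups.
df_between, df_within = degrees_of_freedom(4, 62)

# Variance estimates as printed in Table 2 for three of the scores.
table_2 = {
    "ACL total": (451.82, 13.82),
    "BVI total": (1423.02, 84.33),
    "Stereotype Accuracy z": (0.8633, 0.1432),
}
for score, (between, within) in table_2.items():
    ratio = f_ratio(between, within)
    print(f"{score}: F({df_between}, {df_within}) = {ratio:.2f}")
```

Because the printed variances are rounded, ratios recomputed this way can differ from the tabled F values in the second decimal place.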

Since all of the F tests in Table 2 are significant at or beyond the .01
level of confidence a test for significance of difference between
individual means was made. This test was made using the multiple range
test (Li, 1957, p. 238), which is the most appropriate procedure known to
the experimenters for making “post-mortem” type comparisons between
individual means after an overall F test has been made. Briefly, the
multiple range test involves computing a value which indicates how large
the difference between two means must be in order to be significant at a
stated level, and then comparing the obtained difference to this value.
Results of this analysis are summarized in Table 3.


DISCUSSION 


On each of the four accuracy measures, the 
best judge and both group judgments are sig- 
nificantly superior to the average of the indi- 
viduals composing the group. Thus, the ma- 
jor hypothesis of this experiment is confirmed. 
There is no consistent pattern of significant 
differences among the first three procedures 
mentioned above. As would be expected, on 
the two scores representing the amount of 
variability in predictions, the artificial group 
mean tends to be lower than the means of the 
other three procedures. This tendency is sig- 
nificant, however, only for the Interpersonal 
Accuracy variance score. It is somewhat sur- 
prising to find that the artificial group (or 
pooled independent judgments of items) is 
superior to the best judge on the ACL. The 
interpretation of this finding seems to be that 
if both other judges disagree with the best 
judge, they are more likely to be right than is 
the best judge. If, on the other hand, only 
one of the other judges disagrees with the best 
judge, he is more likely to be wrong than is 
the best judge. 

This study clearly implies that satisfactory 
ratings are least likely to be obtained from a 
single unselected individual. In exploring further implications of these
results for an operational rating setup, several other considerations
enter in. The first of these is that
typically the best judge would be difficult to 
select on an a priori basis, and (because of 
selection error) best judges selected a priori 
would probably score lower than the best 
judges used in this study. Since each of the 
group procedures produces results roughly 
equivalent to the best judge selected on an 
after-the-fact basis, an extensive (and expen- 
sive) effort to identify best judges and use 
them as raters would appear to be unneces- 
sary. 

The second consideration involved in apply- 
ing these results is that by far the most time 
in this experiment was consumed in arriving 
at consensus judgments through group discussion, a finding which one
would certainly expect to generalize to other situations. Since the
artificial group (or pooled independent judgments of items) procedure
produced results as good as or better than the results produced by the
consensus judgment, and required much less time, it would appear to be
most appropriate when accuracy and time are both considered. Thus, the
best procedure for using ratings in many applied situations would be to
obtain several independent ratings from different raters for each ratee,
and then combine them into a single rating. It should be noted, however,
that the superio: 
terms of time 


many relativel 


by the experimental Procedure used in this 
study. 


terviews). 
ratings on 




and the Interpersonal Accuracy correlation term of the BVI that accuracy
is increased through grouping independent of a reduction in variability
(see Table 1). Unlike the other results of this experiment, this would
not necessarily be expected on the basis of previous studies comparing
group and individual performance, although it certainly is consistent
with previous studies. The second [illegible] addition is related to the
current controversy in the “interpersonal perception” literature over the
relative merits of various types of accuracy scores (Cronbach, 1955).

In the current study the total score on the ACL, the total score on the
BVI, and the Stereotype Accuracy and Interpersonal Accuracy correlation
terms all gave consistent results and, more important, results which make
sense in terms of previous research comparing group and individual
performance. This would lead one to hope that the interpretations of
different types of accuracy scores have more in common than previous
investigators have thought.


SUMMARY 


The purpose of the research reported here was to compare accuracy of
“interpersonal perception” or judging ability of individuals versus
groups. This was accomplished by requiring the subjects (or judges) to
predict the responses of six persons (seen in sound color movies of
interview situations) to questionnaires dealing with values and
self-concepts. In the present experiment, these movies were viewed by 186
individuals who were also divided into 62 groups composed of three
persons each. The procedure was for the three subjects composing the
group to first individually and independently “predict” how the person
interviewed in the movie had responded to the questionnaires, and then,
through group discussion, to arrive at a consensus prediction (without
referring back to their earlier predictions). A comparison was made of
the accuracy of (a) the average of the total accuracy scores of the
independent predictions made by each of the three persons composing the
group, (b) the group consensus predictions, (c) the accuracy of an
“artificial group” derived through a statistical combination of the
independent item by item predictions of these same three persons, and (d)
the accuracy of the “best judge” from each group.

The results indicated that the average accuracy of the individuals is
significantly inferior to that of any of the other three procedures, but
that there is no consistent pattern of superiority in prediction among
these procedures. These results suggest that in applied settings
satisfactory ratings are least likely to be obtained from unselected
individuals. However, in terms of time, money, and procedural
difficulties the artificial group procedure or, in other words, the
pooling of several independent judges’ ratings (by items) appeared to be
the most satisfactory procedure.


REFERENCES


Boneau, C. A. The effect of violations of assumptions underlying the t
test. Psychol. Bull., 1960, 57, 49-64.

Box, G. E. P. Non-normality and tests on variances. Biometrika, 1953, 40,
318-335.

Cline, V. B., & Richards, J. M., Jr. Variables related to accuracy in
interpersonal perception. Annual Report, November 1958, University of
Utah, Contract Nonr 171-146, Group Psychology Branch, Office of Naval
Research.

Cline, V. B., & Richards, J. M., Jr. Accuracy of interpersonal
perception: A general trait? J. abnorm. soc. Psychol., 1960, 60, 1-7. (a)

Cline, V. B., & Richards, J. M., Jr. Variables related to accuracy in
interpersonal perception. Annual Report, 1960, University of Utah,
Contract Nonr 171-146, Group Psychology Branch, Office of Naval
Research. (b)

Cronbach, L. J. Processes affecting scores on “understanding of others”
and “assumed similarity.” Psychol. Bull., 1955, 52, 177-193.

Li, J. C. R. Introduction to statistical inference. Ann Arbor: Edwards,
1957.

Lorge, I., Fox, D., Davitz, J., & Brenner, M. A survey of studies
contrasting the quality of group performance and individual performance,
1920-1957. Psychol. Bull., 1958, 55, 337-372.


(Received May 20, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 3, 156-160


DESIRABLE ATTRIBUTES OF WORK: 


FOUR LEVELS OF MANAGEMENT DESCRIBE THEIR JOB 
ENVIRONMENTS 


HJALMAR ROSEN 


University of Illinois 


Although there has been much valuable impressionistic material presented
in the quasi-professional literature, particularly with respect to top
echelon management (Argyris,
1954; Smith, 1955; Warner & Abegglen, 
1955; Whyte, 1954) one is provided with 
little in the way of firsthand reports. The 
purpose of this study, then, is to study how 
various managerial echelons describe their job 
environments with respect to conditions of 
work they deem important. Do differences
exist among the various classes of manage- 
ment personnel within the hierarchy? 

In an earlier analysis (Rosen & Weaver, 
1960) it was pointed out that there was little 
to distinguish among four classes of man- 
agerial personnel studied with regard to what 
they considered to be important conditions of 
work. Utilizing the same “desired” conditions, 
to what extent do members of the four man- 
agerial classes perceive those conditions to 
characterize their environment? From casual 
observation of the industrial scene, one might
assume that job conditions would vary con- 
siderably among as diverse managerial classes 
as first line supervision, time and motion en- 
gineers, general production foremen, and the 
vice president in charge of production. Cer- 
tainly the physical environments differ con- 
siderably, but in terms of commonly desired 
job characteristics, would the differences be 
as great? 


METHOD 
Sample 


The subjects of this 
managerial personnel ( 
ofa moderately sized, 
ing concern in a ratl 


midwestern, urban 
center. The "managers? 


Were divided into four cat- 
nizational charts, Status, and 


` ell into the top manager cat- 
egory, 36 into a middl 


into a staff specialist cat 
supervision category. T. 
cluded such job titles 


manager, etc. Middle management included such positions as accounting
supervisors and general foremen. Staff specialists included all
[illegible] technical personnel such as time study men, [illegible]
engineers, personnel specialists, etc. And first line supervision, as the
class title implies, included all departmental foremen. The staff
specialists were considered members of management, and in the words of
the plant manager “are the reservoir upon which we draw for higher
managerial posts.”


Method 


Twenty-four desirable conditions of work within four major areas (Rosen &
Weaver, 1960) were presented to the subjects (see Table 1). They were
asked to use seven response categories to indicate the extent to which
they perceived these conditions to exist within their job environments.
The response categories ranged from “Always” through “Never” and in
addition were “spelled out” in behavioral terms.¹

Four basic analyses of the data were undertaken. First, means and
standard deviations of the items for each of the four managerial classes
were computed. Second, sign tests were applied to all possible managerial
group pairings in terms of the 24 item means. Third, t tests were run for
each of the 24 items on all possible managerial group pairings. Finally,
Pearson r correlations between all possible pairings of groups across all
24 descriptive conditions were computed.


jthin 
pre 
wen 

he 


stjo! 
efinit f 

¹ The following responses and response definitions were provided: (0) "Always": "From your point of view, this always exists in your job. [You] cannot [...]." (1) "[A Large Share of the Time]": "From your point of view, this exists in your job a large [share of the time]. Although you might be able to think of [times when it did not], you would be hard pressed to do so." (2) "More Often Than Not": "From your point of view, this exists in your job more often than [not, but] you [could] quite easily think of times when it did [not]." (3) "About as Often as Not": "From your point of view, this exists in your job about as often [as not]." (4) "Less Often Than Not": "From your point of view, this exists in your job less often than [not, but you could] quite easily think of times when it does." (5) "[A Small] Share of the Time": "From your point of view, this exists in your job only a small share of [the time. Al]though you might be able to think of [times when it] did occur, you would be hard pressed to [do so]." (6) "[Never]": "From your point of view, this never [exists in your] job. You cannot think of any instances [when it did]."




TABLE 1

TWENTY-FOUR CONDITIONS OF WORK USED AS STIMULI ITEMS

Area 1. Relations with Superiors
 1. Having the opportunity to talk over problems with my superior
 2. [Knowing] whose order to follow
 3. [Work]ing under a superior who explains any [deci]sions he makes
 4. [Hav]ing superiors who will help me out, but not [take] over when I get into a jam
 5. [Work]ing under superiors who judge me solely in terms of merit
 6. Working under superiors who delegate as much of their authority as possible
 7. Knowing where I stand as far as my superiors are concerned
 8. Working under men who attempt to develop their subordinates
 9. Working under superiors who recognize the problems involved in my work
10. Working under a man who will take over and do the job for me when I get into a jam

Area 2. Relations with Company
11. [Hav]ing knowledge of plant plans that affect me [and] my job
12. Working in a plant where the responsibilities of every supervisor are clearly defined
13. Working for a company that stresses experience more than education for promotion
14. Working in a plant that is operated efficiently
15. Working in a plant that has clear cut, long range objectives

Area 3. Relations with Peers
16. Having the other managers at my level recognize the importance of my work
17. Having knowledge of what others are doing inasmuch as it may affect me and my job
18. [Having] mutual cooperation among the managers [at] my level
19. Working with fellow managers who recognize the problems involved in my work
20. Working with fellow managers who will help me out when I get into a jam

Area 4. Decision Making + Implementation
21. Having the opportunity to share in plant policy making decisions
22. Being consulted before decisions are made which concern me and my department
23. Having sufficient authority for the job expected of me
24. Having management meetings where everyone can have his say


Desirable Attributes of Work 


RESULTS AND DISCUSSION 


Reviewing Table 2, in general it appears 
that the environments of the four levels of 
management studied are relatively rich in de- 
sired conditions of work. More specifically, all 
levels stated, in the area of Relations with 
Superiors, that: superiors were willing to talk 
over problems with them (Item 1); they 
knew whose orders to follow (Item 2); supe- 
riors recognized problems involved in the 
work of their subordinates (Item 9); and the 
superior would help out a subordinate who 
was in need (Item 10). There was no general 
agreement in terms of high incidence of oc- 
currence in Area 2, Relations with Company, 
although all levels with the exception of staff 
specialists indicated that they perceived the 
plant to be operated efficiently (Item 14) 
with clear cut, long range objectives (Item 
15). Area 3, Relations with Peers had two 
areas of reported high incidence for all levels: 
cooperative (Item 18) and helpful (Item 20) 
fellow managers. Area 4, Decision Making 
and Implementation, had no items indicated 
having high incidence for all levels. It should 
be pointed out that this area contained the 
only cases where any management level indi- 
cated conditions of work being less rather 
than more characteristic of their job environ- 
ments; specifically with regard to sharing in 
policy making decisions (Item 21) and hav- 
ing meetings where everyone can have his say 
(Item 24). 

These results suggest that the work envi- 
ronments of the four levels of management 
studied, although differing along a physical 
continuum sharply, provided for the social 
and psychological needs that were manifest. 
Moreover, there is some justification for 
pointing up the area of decision making and 
implementation as the one in which desirable 
conditions of work were least prevalent. 

A sign test analysis of the data indicated 
that on 21 of the 24 items the means for top 
management exceeded those of staff special- 
ists (.01 level) and on 18 items they exceeded 
the means of first line supervision (.05 level). 
Middle management's item means exceeded 
those of staff specialists in 24 cases (.01 
level) and those of first line supervision in 23 
(.01 level). Top and middle management 


Hjalmar Rosen 


TABLE 2
MEANS AND STANDARD DEVIATIONS RE EXTENT TO WHICH WORK CONDITIONS
CHARACTERIZE JOB ENVIRONMENTS OF FOUR LEVELS OF MANAGEMENT

         Top Mgt.          Middle Mgt.       Staff Spec.       First Line
Item   Mean    SD        Mean    SD        Mean    SD        Mean    SD

Area 1. Relations with Superiors
  1    1.000   .926       .583  1.011       .861  1.036       .804  [illeg.]
  2     .420   .502       .500   .687       .692   .803       .826  [illeg.]
  3     .850   .367      1.333   .800      1.908  1.378      1.608  1.285
  4    1.850  1.649      1.000  1.054      1.492  1.416      1.565  [illeg.]
  5    1.420   .917      1.555  1.092      1.630  1.410      2.086  1.416
  6    2.280  1.394      1.333  1.313      2.046  1.534      1.673  [illeg.]
  7    1.420   .917      1.694  1.330      2.215  1.422      2.043  1.245
  8    1.000   .756      1.194   .811      1.676  1.166      1.804  [illeg.]
  9    1.000   .756       .972   .866      1.184   .876      1.326  1.676
 10    1.000  1.773      1.166  1.788      1.184  1.805      1.435  [illeg.]

Area 2. Relations with Company
 11    1.280  [illeg.]   1.527  1.055      2.492  1.551      1.891  [illeg.]
 12    1.570   .906      1.611  1.370      2.092  1.376      1.782  [illeg.]
 13    2.830   .648      2.333  1.599      2.923  1.610      2.804  [illeg.]
 14    1.000   .000      1.083  1.051      1.507   .964      1.086   .937
 15    1.000   .756       .805   .938      1.800  1.146      1.239  [illeg.]

Area 3. Relations with Peers
 16    1.000   .535      1.500   .726      1.707  1.390      1.434  [illeg.]
 17    1.850  1.137      1.277   .769      2.184  1.392      1.847  [illeg.]
 18     .850   .367      1.333   .882      1.276  [illeg.]   1.478  [illeg.]
 19    1.420  1.061      1.583   .830      1.507  1.112      1.695  [illeg.]
 20     .710  [illeg.]   1.027  [illeg.]   1.384  1.092      1.478  [illeg.]

Area 4. Decision Making + Implementation
 21    1.710   .710      2.583  1.422      4.030  1.824      3.608  [illeg.]
 22    1.710   .710      1.444  1.092      2.676  1.629      2.108  [illeg.]
 23    1.570  1.295       .916   .954      1.723  1.365      1.043  [illeg.]
 24    3.000  1.604      2.305  1.664      3.057  1.842      2.565  [illeg.]


were undifferentiated in this analysis, 14 and 
10, respectively, and staff specialists and first 
line supervision were undifferentiated, 9 and 
14, respectively. Essentially, then, in terms of 
perceived degree of occurrence across all job 
conditions, there seem to be, literally, two sig- 
nificantly different groups made up of sub- 
classes that were undifferentiated. 
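
The sign-test levels quoted above can be checked with a short binomial calculation. The sketch below is illustrative only: it assumes a two-sided test, no tied item means (n = 24), and a hypothetical helper name `sign_test_p`; the article does not state which form of the test was used.

```python
from math import comb


def sign_test_p(wins: int, n: int) -> float:
    """Two-sided sign-test p value for `wins` out of `n` untied comparisons.

    Under the null hypothesis each comparison is a fair coin flip, so the
    count of 'wins' follows a Binomial(n, 0.5) distribution.
    """
    k = max(wins, n - wins)                      # size of the larger group
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)              # double the tail for a two-sided test


if __name__ == "__main__":
    # 21 and 18 of 24 items (top management vs. the two lower levels),
    # and 14 of 24 (top vs. middle management).
    for wins in (21, 18, 14):
        print(wins, sign_test_p(wins, 24))
```

Under these assumptions, 21 of 24 falls below the .01 level, 18 of 24 falls between .01 and .05, and a 14 to 10 split is not significant, which agrees with the levels reported in the text.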

Regardless of the preliminary nature of 
such analyses, the findings are in keeping 
with the current belief that the higher one 
goes in the management hierarchy, the greater 
are the rewards of the environment. It should 
be noted, however, that the findings suggest a 
dichotomization rather than an orderly progress. It appears that advancement from low echelons to high, at least within this organization, would bring about greater [...] of desired conditions of work, but not advancement within either the low or the high classifications. Apparently, until one achieves relatively high supervisory status [within the] organization, major increases in desirable job conditions do not tend to occur.

From Table 3, it is apparent that the means of top- and middle-management personnel did not differ from one another significantly on any item. Top management differed significantly from first line supervision on two items and from staff specialists in one. This infrequency of significantly greater occurrence of desirable job conditions in this case may be [...] a statistical artifact, however, due to [...] error resulting from the N [...] in the top management level. Middle management's item means were significantly greater than those of the staff specialists in 11 cases, including all conditions relating to Area 4, Decision Making and Implementation. Its means were significantly greater than first line supervision in seven cases, scattered among all areas. First line supervision's item means exceeded those of staff specialists in five cases. [...] of the cases fell within the area of relationships with supervision, however.

These results suggest that the lower management echelons have the least rewarding job environments, but perhaps more important, that the staff specialists rather than their organizational inferiors, first line supervisors, report less incidence of favorable conditions of work than any other level. The widely discussed problem of the relative isolation of the technician and foreman from the mainstream of management and their organizational superiors may account in part for such findings. The relatively richer job environment of first line supervision relative to that of staff specialists perhaps may be accounted for in terms of line-staff differences in function.


TABLE 3
[...] DIFFERENCES BETWEEN MANAGERIAL LEVELS IN TERMS OF THE EXTENT TO WHICH WORK
CONDITIONS WERE DESCRIBED AS CHARACTERIZING RESPECTIVE JOB ENVIRONMENTS AS
EVIDENCED BY t SCORES

Item       Levels Compared               t
[illeg.]   Middle Mgt.   Staff Spec.     2.282
[illeg.]   Middle Mgt.   First Line      2.165
8          Middle Mgt.   Staff Spec.     2.345
[illeg.]   Middle Mgt.   Staff Spec.     2.181*
[illeg.]   Middle Mgt.   First Line      2.531
[illeg.]   Middle Mgt.   Staff Spec.     3.305
14         First Line    Staff Spec.     2.058
14         Middle Mgt.   First Line      2.067
14         Middle Mgt.   Staff Spec.     2.029*
15         First Line    Staff Spec.     2.326
[illeg.]   Middle Mgt.   Staff Spec.     2.409
[illeg.]   First Line    Staff Spec.     2.710
17         Middle Mgt.   First Line      2.408
[illeg.]   Middle Mgt.   Staff Spec.     3.585
21         Top Mgt.      First Line      2.416
21         Top Mgt.      First Line      2.625
[illeg.]   Top Mgt.      Staff Spec.     3.295
[illeg.]   Middle Mgt.   First Line      2.719
22         Middle Mgt.   Staff Spec.     [illeg.]
22         Middle Mgt.   First Line      2.22[illeg.]
23         Middle Mgt.   Staff Spec.     4.026**
23         Middle Mgt.   Staff Spec.     3.116
24         First Line    Staff Spec.     2.656
[illeg.]   Middle Mgt.   Staff Spec.     3.223
[illeg.]   First Line    Staff Spec.     2.691

 * Significant at .05 level.
** Significant at .01 level.
Note. Item numbers and significance marks are given only where legible in the scan.


TABLE 4
PEARSON r CORRELATIONS ACROSS TWENTY-FOUR JOB CONDITION ITEM MEANS
AMONG FOUR LEVELS OF MANAGEMENT

Classes                                       r
Top Management vs. Middle Management        .676
Top Management vs. Staff Specialists        .711
Top Management vs. First Line               .638
Middle Management vs. Staff Specialists     .875
Middle Management vs. First Line            .930
Staff Specialists vs. First Line            .916

The job environments of the four man- 
agerial levels, as denoted by the 24 job condi- 
tions, were found to be highly correlated with 
one another (see Table 4). The intercorrela- 
tions among middle management, staff spe- 
cialists, and first line supervision’s reported 
environments are of a magnitude that equals 
the reliability of the scale.² The correlations of top
management's reported environment with the respective
environments of the other levels, although significant,
tended to be of somewhat smaller magnitude. In general,
however, in spite of the significant differences in
frequency of occurrence reported earlier, the profiles
are essentially the same.
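
The profile correlations in Table 4 are ordinary Pearson coefficients computed across the 24 item means. As a rough sketch of the calculation, the example below uses the top and middle management means as they can be read from the scanned Table 2; a few of those values are uncertain, so the result is not expected to reproduce the published coefficient exactly. The function name `pearson_r` is an illustrative choice.

```python
from math import sqrt

# Item means for items 1-24 as read from the scanned Table 2.
# Some digits are uncertain in the scan, so treat these as approximate.
top = [1.000, 0.420, 0.850, 1.850, 1.420, 2.280, 1.420, 1.000, 1.000, 1.000,
       1.280, 1.570, 2.830, 1.000, 1.000, 1.000, 1.850, 0.850, 1.420, 0.710,
       1.710, 1.710, 1.570, 3.000]
middle = [0.583, 0.500, 1.333, 1.000, 1.555, 1.333, 1.694, 1.194, 0.972, 1.166,
          1.527, 1.611, 2.333, 1.083, 0.805, 1.500, 1.277, 1.333, 1.583, 1.027,
          2.583, 1.444, 0.916, 2.305]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(top, middle)

if __name__ == "__main__":
    print(round(r, 3))
```

Because the coefficient is unaffected by a constant shift in either profile, two levels can differ significantly in overall incidence and still correlate highly, which is the pattern the text describes.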
These results suggest that the concept of 
“management climate” may have some validity, i.e., that,
in effect, an industrial organization provides a
characteristic work atmosphere within which the
management personnel play their roles.
Herzberg, Mausner, Peterson, 
& Capwell (1957) suggested that managerial 
commonality in supervisory behavior is a 
function of the subordinate attempting to 


² The odd-even reliability (corrected) of the description of job environment scale was .92.


model his behavior after his superior. Insofar 
as the conditions probed in this study dealt 
largely with interpersonal relations with peers 
and superiors, perhaps each level, modeling 
after the level above and providing a model 
in turn for those below, created a character- 
istic pattern of work conditions that pervaded 
all levels. 
SUMMARY 


Four management levels in a moderately 
sized industrial firm were asked to describe 
the extent to which 24 desirable conditions of 
work characterized their respective job envi- 
ronments to determine whether or not, in 
spite of differences within the physical envi- 
ronments of the four levels, conditions relat- 
ing to (a) relation with immediate superiors, 
(b) relationship with company as an insti- 
tution, (c) relationship with organizational 
peers, and (d) role in decision making and 
implementation, differentiated among the 
levels. 

All management levels studied, in general, 
reported a high incidence of desired condi- 
tions of work. There was some evidence that 
relatively speaking Area 4, Decision Making 
and Implementation, was the area where de- 
sirable conditions of work were reported to be 
least prevalent. 

There was some evidence suggesting that 
richness of the job environments in terms of 
desirable conditions of work was related posi- 
tively to increased status in the hierarchy. 




This trend, however, indicated a dichotomous 
rather than a four-fold progression with the 
two upper echelons indicating significantly 
greater overall occurrence than the two lower 
levels. 

Staff specialists compared to the other levels indicated the least incidence of desirable conditions of work, particularly and understandably in Area 4, Decision Making and Implementation.

The profiles re occurrence of desirable conditions of work for all levels were significantly parallel in spite of significant differences in magnitudes for given items between levels. The top management level profile showed the greatest deviancy from other levels, however.

REFERENCES

Argyris, C. Some characteristics of successful executives. Personnel J., 1954, 32, 50-55.
Herzberg, F., Mausner, B., Peterson, R. O., & Capwell, Dora F. Job attitudes. Pittsburgh: Psychological Service of Pittsburgh, 1957.
McNemar, Q. Psychological statistics. New York: Wiley, 1949.
Rosen, H., & Weaver, C. G. Motivation in management: A study of four managerial levels. J. appl. Psychol., 1960, 44, 386-392.
Selznick, P. An approach to the theory of bureaucracy. Amer. sociol. Rev., 1947, 8, 51-54.
Smith, R. A. The executive crack-up. Fortune, [...] 51, 108-111, 172, 177-178, 180.
Warner, W. L., & Abegglen, J. C. Big business leaders in America. New York: Harper, 1955.
Whyte, W. H., Jr. How hard do executives work? Fortune, 1954, 49, 108-111, 150-152.


(Received June 13, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 3


DRIVER JUDGMENTS OF RELATIVE CAR VELOCITIES


PAUL L. OLSON, ROBERT A. WACHSLER, AND HERBERT J. BAUER

General Motors Research Laboratories


The operator of an automobile is almost [...] faced with a problem of estimating the velocities of preceding cars relative to his own speed. The great number of high-speed rear end collisions indicates that these judgments may not be made accurately. [This] widely held opinion has not been verified experimentally however. Thus, as a first step to finding ways of reducing this type of accident it seemed advisable to [...] the ability of people to make these relative velocity judgments.

The purpose of this investigation was two-fold. First, to learn how accurately drivers can determine whether the gap between their own and a preceding car was opening, holding constant, or closing. Second, to determine how well drivers can discriminate among different rates of change of this gap.


METHOD

Two independent variables were selected for investigation. These were: the direction and rate of change of the distance between the lead and the following vehicles, and the spacing between the two vehicles at the start of the subjects' observation period. [...] were required and different ones were [...] of the several days during the experiment. To compensate for the discrepancies usually [...] calibrated speedometers, that of the lead car was [...] to a standard in the follower car. Two-way communication between the vehicles was maintained by means of portable short-wave radio.

The direction and rate of change of the gap between the two cars was controlled by holding the velocity of the following car constant at 40 mph and varying the velocity of the lead car from 10 to 70 mph in increments of 10 mph. This established seven [...] conditions with a maximum difference of 30 mph. Two separations, one-tenth and two-tenths of a mile, were established between the vehicles at the start of each observation period. [These] were set up with the aid of one-tenth [...] installed along the edge of the test [...] the study was conducted. The two [separations], together with the seven speed conditions, [...] a total of 14 experimental conditions which [were admin]istered in one of three random orders [...].
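
The design described above can be enumerated directly. The sketch below is only an illustration: the variable names are hypothetical, and the lead-car range of 10 to 70 mph is inferred from the 40-mph follower speed and the 30-mph maximum difference.

```python
from itertools import product
from math import comb

FOLLOWER_MPH = 40                      # follower car held constant
lead_speeds_mph = range(10, 80, 10)    # 10, 20, ..., 70 mph
separations_mi = (0.1, 0.2)            # starting gaps in miles

# Each condition is (speed difference, separation); a negative difference
# means the lead car is slower than the follower, so the gap is closing.
conditions = [(lead - FOLLOWER_MPH, sep)
              for lead, sep in product(lead_speeds_mph, separations_mi)]

# Number of distinct pairs of condition means available for pairwise tests.
n_pairs = comb(len(conditions), 2)

if __name__ == "__main__":
    print(len(conditions), n_pairs)
```

Crossing seven speed differences with two separations gives the 14 conditions of the study, and 14 condition means yield 91 distinct pairs.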




Eleven adult subjects were employed, all experi- 
enced drivers. They rode as front seat passengers in 
the follower vehicle and at specified times observed 
the behavior of the lead vehicle. In practice, the 
subject was driven to the test track where he was 
briefly oriented regarding the purpose of the study 
and given his instructions. The subjects were told 
about the range of speed discrepancies and that the 
speed differences would be in multiples of 10 mph. 
They were not told how many judgments they 
would make or anything about the order of 
presentation. The orientation was followed by a 
practice run in which the subject made two judg- 
ments, comparing his estimates with actual condi- 
tions. Any final questions were then answered and 
the treatments administered. 

The procedure for all experimental conditions 
was identical. The subject rode next to the driver 
of the follower car with his eyes averted from the 
lead car. The driver of the lead car varied his speed 
to open or close the interval between his car and 
the trailing vehicle to the required distance. When 
this spacing had been achieved the operator of the 
lead car was advised to adjust his speed to that 
called for by the particular treatment. When the 
speed discrepancy had been established the subject 
was notified, via the radio, to make a judgment. At 
this signal the subject was allowed to look up and 
observe the lead car for a maximum of 7 seconds. 
At the end of this time he entered his judgment 
on a data sheet in two parts; first, whether he 
judged the interval to be opening, closing, or holding 
constant and, second, an estimate of the speed 
discrepancy between the two cars in miles per hour. 

An analysis of variance was used as the first step 
of the analysis, the Duncan procedure being em- 
ployed to test differences between individual means. 
An analysis of the type described by Garner and 
Hake (1951) was run to determine the amount of 
information transmitted under the separation condi- 
tions. As a final step a correlational analysis was 
run on the data. 


RESULTS 


The matrix presented in Table 1 contrasts 
subjects’ judgments with actual conditions. 
The row and column headings are expressed 
in terms of speed discrepancies. For example, 
a judgment of -30 implies that the subject 
thought that the gap was closing and the 
speed discrepancy was 30 mph. Mean judg- 
ments for each condition are listed along the 
right edge of the table. Frequencies 

TABLE 1
DISTRIBUTION OF JUDGMENTS MADE UNDER EACH EXPERIMENTAL CONDITION

                       Subjects' Judgments of Speed Difference
          Separation             in Miles Per Hour                Mean
Actual    in Miles     -30  -20  -10    0  +10  +20  +30         Judgment

[Rows for the +30, +20, +10, and 0 conditions are not recoverable from the scan.]
-10       .10                3    3    5                         -08.2
-10       .20           1    3    5    2                         [illeg.]
-20       .10          [illeg.]                                  -22.7
-20       .20           4    3    4                              -20.0
-30       .10           8    2    1                              -26.[illeg.]
-30       .20           7    3    1                              [illeg.]
in the outlined diagonal are correct judgments.

There was a total of 154 judgments, of which 62 were correct. Of the 92 which were incorrect, 62 were conservative in that the subjects made judgments which were underestimations of the lead car's speed. It is particularly interesting to note that there were only three reversal errors (errors where the subject said that the gap was opening when it was really closing, or vice versa). All these errors occurred under the same experimental conditions (0.2-mile separation and +20-mph speed differential) and amounted to underestimations of the lead car's speed.

There were seven decisions which were potentially dangerous, where the subject judged the gap to be constant or opening when in reality it was closing. All these errors were made when the speed differential was -10 mph. It should be noted that even under the .10-mile separation condition a speed discrepancy of the magnitude of -10 mph would allow more than a half minute for the individual to change his mind before it was too late. The subjects in this study were allowed only 7 seconds observation.

The analysis of variance showed differences significant at the .01 level between the mean judgments made under the [...] speed conditions at both separations. The results of the Duncan tests indicated that [...] of the 91 pairs of means did not differ significantly at the .05 level.
An information analysis was run on the data presented in Table 1 to determine the amount of information in bits per stimulus transmitted under each of the separation conditions. Under ideal circumstances, where subjects unerringly guessed the direction of change as well as the exact speed discrepancy, 2.81 bits would have been transmitted by seven stimuli such as were employed in this study. The calculations showed that 1.05 and [...] bits were transmitted at the .20- and .10-mile separations, respectively.
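
As an illustration of the kind of calculation involved, transmitted information can be estimated from a stimulus-by-response frequency matrix as T = H(stimulus) + H(response) - H(joint). The sketch below is a minimal version of that standard estimate, not necessarily the exact procedure of Garner and Hake (1951); the function name and the two example matrices are hypothetical.

```python
from math import log2


def transmitted_bits(matrix):
    """Estimate transmitted information (bits) from a frequency matrix.

    Rows are stimulus categories, columns are response categories.
    T = H(rows) + H(columns) - H(joint cells).
    """
    total = sum(sum(row) for row in matrix)

    def entropy(counts):
        return -sum((c / total) * log2(c / total) for c in counts if c > 0)

    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    cells = [c for row in matrix for c in row]
    return entropy(row_totals) + entropy(col_totals) - entropy(cells)


# Hypothetical extremes for seven stimuli judged by 11 subjects each.
perfect = [[11 if i == j else 0 for j in range(7)] for i in range(7)]
chance = [[1] * 7 for _ in range(7)]

if __name__ == "__main__":
    print(round(transmitted_bits(perfect), 2))   # 2.81, i.e. log2(7)
    print(round(transmitted_bits(chance), 2))
```

With seven equally frequent stimuli, perfect identification gives log2(7), about 2.81 bits, which is the ideal figure quoted in the text; responses unrelated to the stimuli give zero.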
It is [...] from Table 1 that subjects tended to underestimate the speed of the lead car. The overall mean difference was -4.6 mph. The mean estimates at each condition are plotted in Figure 1. The best fit line drawn through the data illustrates the accuracy with which one could predict mean response knowing the actual conditions. The [correlation] between the mean estimates and actual conditions was very high, r = .99.

Consideration of the overall means, however, tends to mask much interesting data. For [example], it should be noted that the subjects were much more accurate in making their judgments when the gap between the cars was closing than when it was opening. The mean error of estimate was only 1.5 mph where the gap was closing as contrasted with [...] mph when it was opening. Subjects’ estimates were also more accurate at the .10-mile interval as compared with the .20-mile distance. The mean error at the .10-mile separation was 0.9 mph, while that at the .20-mile separation was 6.8 mph.

Figure 2 shows a plot of mean judgments made under the various speed conditions at the different separations. It can be seen that the estimates tended to be quite accurate when the gap was closing in either instance but, especially at the +20 and +30 differential, the estimates made at the longer distances were much less accurate.

Fig. 1. Relationship between actual speed differences and subjects' estimates of the speed differences. (Axes: Actual Conditions; Mean Subject Judgment.)

Fig. 2. Relationship between actual speed differences and subjects’ estimates of the speed differences for [...] separation condition. (Axes: Actual Conditions; Mean Subject Judgment.)



DISCUSSION 


It seems clear from the data that people 
are capable of rather accurate discriminations 
in making judgments regarding velocity dif- 
ferentials. Most of the errors were made in 
a direction which would, if anything, pre- 
cipitate more caution on the part of the 
operator than might have been deemed neces- 
sary if the judgment had been correct. 

The information analysis revealed that the 
subjects were receiving only enough informa- 
tion to reduce uncertainty by about one-half. 
This would imply that the subjects were able 
to reject three to four possibilities in each 
judgment situation and make a random 
choice among the others. From the data it is 
apparent that the subjects seldom erred in 
judging whether the gap was opening or 
closing though there was some confusion with 
the constant situation. Thus the problem ap- 
peared to be such that the subjects could 
determine the direction of change without too 
much trouble, but they were much less certain 
when estimating the precise speed differential. 
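
The step from bits transmitted to "possibilities rejected" can be made explicit. The sketch below uses only the 1.05-bit figure reported for the .20-mile separation and assumes seven equally likely alternatives; the variable names are illustrative.

```python
from math import log2

n_alternatives = 7                      # seven speed-difference categories
max_bits = log2(n_alternatives)         # about 2.81 bits for perfect judgments
transmitted = 1.05                      # bits reported at the .20-mile separation

# Residual uncertainty, expressed as an equivalent number of
# equally likely alternatives still in play after the judgment.
remaining = 2 ** (max_bits - transmitted)
rejected = n_alternatives - remaining

if __name__ == "__main__":
    print(round(remaining, 2), round(rejected, 2))
```

On these assumptions about 3.4 alternatives remain and between three and four are ruled out, in line with the interpretation given in the text.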

The most accurate judgments were made 
at the closer distances and under conditions 
where the gap was closing. This is not sur- 
prising certainly, but it is comforting. It is 
particularly interesting to note that there 
were very few dangerous decisions and that 
all of these were made when the gap was 
closing at the minimum rate. 
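
For a sense of scale, the time available before the gap closes can be computed from the separation and the closing rate. This is a back-of-envelope sketch that assumes the closing rate stays constant until the gap reaches zero; the function name is hypothetical.

```python
def seconds_to_close(separation_mi: float, closing_mph: float) -> float:
    """Seconds until a gap of `separation_mi` miles closes at `closing_mph`."""
    return separation_mi / closing_mph * 3600


if __name__ == "__main__":
    # Minimum closing rate (10 mph) at each starting separation.
    for sep in (0.10, 0.20):
        print(sep, seconds_to_close(sep, 10))
```

At the .10-mile separation and a 10-mph closing rate the gap takes about 36 seconds to close, roughly five times the 7-second observation period allowed to the subjects.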

In general then, people tend to do rather 
well in making the type of judgments called 
for in this study. It further would appear 
that there is little reason to believe that 
dangerous actions would be frequently based 
on the information supplied by these types 
of judgments. 


SUMMARY AND CONCLUSIONS 


Eleven subjects were evaluated for their 
ability to detect the direction and rate of 
change of the interval separating the car in 
which they were riding from a preceding car. 
This interval was set at one of two magni- 
tudes and could remain constant or open or 
close at one of three rates. 

The following conclusions are based on the 
data collected: 


1. In the range of speed differences tested, 
people tend to be quite accurate in determining 
whether the distance between their car and 
a preceding one is increasing or decreasing. 

2. People exhibit a better than chance 
ability to discriminate between opening and 
closing rates at least as fine as 10 mph. 

3. The accuracy with which judgments 
such as these can be made increases as the 
distance between the vehicles decreases. 

4. Judgments are made more accurately 
when the gap is closing than when it is 
opening. 




5. In the range of speed differences studied, 
subjects tended to underestimate the relative 
speed differential between their car and the 
one in front of it. 


REFERENCE 


Garner, W. R., & Hake, H. W. The amount of 
information in absolute judgments. Psychol. Rev., 
1951, 58, 446-459. 


(Received July 1, 1960) 




Journal of Applied Psychology
1961, Vol. 45, No. 3, 165-169


A COMPARISON OF TWO METHODS OF TRAINING 
IN A COMPLEX TASK BY MEANS 
OF TASK SIMULATION¹ 


J. S. KIDD

Ohio State University


One of the most obvious trends in military and [...] operations has been the gradual [...] skill requirements for workers [as the] basic [...] have become more complex. The change in the nature of the jobs has created the need for coordinated changes in worker training. Training philosophy, methods, and equipment are all susceptible to review and modification in the face of these increasingly imperative requirements for skilled operators and technicians.

The present experiment was undertaken within the context provided by one of the more [...] developments in the training field; [...] system simulation. Simulation itself is not a new concept. The Link trainer of [...] World War II vintage is a relatively modern application. However, recent progress in implementing the simulation process, in [...] with the changing nature of the tasks to be learned, has greatly enhanced both the current usefulness and future potential of [...] techniques (Chapman, [...]; [...] Biel, 1959; Goodwin, 19[...]). The long history of simulation and its present [...] development, unfortunately, have not been accompanied by very extensive research [...]. In the particular area of pilot training, where most of the research to date has been accomplished, there are still more unanswered questions [...] (Muckler, [...] & Williams, [...]). [...] of detailed problems [...] understood is the influence of [...] training efficiency in a simulated [...] load was selected for investigation in the present experiment since its influence has already been established in quite different task settings. Barch (1953), Barch and Lewis (1954), Green (1955), and Szafran and Welford (1950) have all found that a relatively high level of “task difficulty" or task load imposed relatively early in the learning sequence was facilitative. The tasks employed were predominantly of the type dependent on psychomotor skills. In the present instance the basic hypothesis is extended to a predominantly discrimination-decision making task provided by the simulation of a radar air-traffic control center operation.

¹ This research was carried out in the Laboratory of Aviation Psychology and was supported by the United States Air Force under Contract No. AF [...], monitored by the Aerospace Medical [...]. Permission is granted for reproduction, translation, publication, use, and disposal in whole or in part by or for the United States Government.

METHOD 


Apparatus, Task, and Subjects 


The general task environment was provided by the 
simulation within the laboratory of a radar air traffic 
control center. The simulation was implemented by 
the specially developed OSU Electronic Air Traffic 
Control Simulator (Hixson, Harter, Warren, & 
Cowan, 1954). This device, which is built around 
an analog computer, is capable of generating up to 
30 aircraft targets and presenting them realistically 
to the radar controller via a cathode ray tube 
display. Direct manipulation of the “aircraft” is 
accomplished by college students trained to faithfully 
carry out pilot functions. In addition to the visual 
display of aircraft position available to the con- 
troller, he is in direct auditory communication with 
the “pilots” under his jurisdiction through simulated 
radio channels. 

The task requires S to act as a radar controller 
He is responsible for the guidance of aircraft within 
a specified zone of responsibility. The normal ap- 
proach route is 50 mi. in length. The controller 
must manipulate the position, heading, airspeed, and 
altitude of the aircraft under his direction while 
they are coming in to land. He must see to it that 
the landing approach is made expeditiously and 
safely. 

The 16 novice controllers who participated in this study were selected from a total population of twice that number of undergraduate students at Ohio State University. These 32 students were initially employed in earlier studies as pilots and thus were familiar with the simulation operations. Selection was based on the proficiency shown during [...] training sessions. The best 16 Ss were chosen as controllers.


Description of Experimental Variable 


The results of prior experiments in this series have 
led to the conclusion that the number of aircraft
under control (rather than, for example, the entry 
rate parameter per se) is the most valid single 
determinant of input load (Kidd & Kinkade, 
1958). It was possible through the use of special 
procedures to maintain the number of aircraft under 
control (target density) at any desired level for any 
required duration. Thus, input load level in these 
terms could be maintained at a constant level 
throughout a given problem. 


A comparison was made in this study between 
training under consistently high input load condi- 
tions (Experimental Group I) vs. training under 
conditions of gradually increasing input load (Ex- 
perimental Group II). 

The exact level of input load assigned to Experi- 
mental Group II was determined on the basis of 
data obtained during a set of special pre-experimental 
trials. Three novice controllers, having a tested ability 
level at the median value of the total sample used 
in this study, were each given a series of 10 special 
problems. The target density was varied from 
problem to problem within the range from four to 
six aircraft under control. The order of the problems 
was random for each controller. The results were 
pooled to provide a base relationship between target 
density and average control time, one of the funda- 
mental criteria of performance, and between training 
trials and average control time. By arbitrarily selecting
a single value on the average control time
dimension, it was possible to derive an equation 
which would give a fair approximation of the trade- 
off function between the level of training and target 
density. Thus, performance could be theoretically 
held constant across training trials by simultaneously 
manipulating density. The graduated input load 
schedule was determined on this basis. The resultant
progression in order was 4, 4, 4, 5, 5, 5, 5, 6, 6, 
6 aircraft under control for the series of 10 problems 
in this experimental condition. 
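
The two input-load schedules can be written out directly. The sketch below is only an illustration: the lists follow the progression stated in the text, while the helper names (`total_exposure`, `first_trial_at`) and the idea of summing densities as an exposure index are assumptions made here and are not part of the original report. The trade-off equation used to derive the graduated schedule is not reported, so it is not reproduced.

```python
# Aircraft under control on each of the 10 problems, as stated in the text.
CONSTANT_LOAD = [6] * 10                          # Experimental Group I
GRADUATED_LOAD = [4, 4, 4, 5, 5, 5, 5, 6, 6, 6]   # Experimental Group II


def total_exposure(schedule):
    """Sum of target densities over all problems (hypothetical summary index)."""
    return sum(schedule)


def first_trial_at(schedule, density):
    """Return the 1-based problem number at which `density` first occurs, or None."""
    for trial, level in enumerate(schedule, start=1):
        if level == density:
            return trial
    return None


if __name__ == "__main__":
    for name, schedule in (("Group I", CONSTANT_LOAD), ("Group II", GRADUATED_LOAD)):
        print(name, schedule, "total exposure:", total_exposure(schedule))
```

Under this reading, Group II does not meet the six-aircraft test density until the eighth problem, whereas Group I works at that density throughout.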

The other experimental condition (Experimental 
Group I) required that the level of aircraft under 
control be held at six aircraft for all trials. A graphic
comparison of the two experimental conditions is 
presented in Figure 1. The three novice controllers 
utilized in the preliminary evaluation of the vari- 
ables were not included in the study proper. 


Initial Training and Matching Trials 


In order to provide data for selection purposes
and matching of participants, a program of initial
training was undertaken prior to the study proper.
The initial training period involved a total of 8 hr.
Seven hr. were spent in classroom training as a
group, and 1 hr. was devoted to individual training.
In detailed breakdown, the preliminary training
consisted of 2 hr. of lectures on the principles of air
traffic control given by E, 4 hr. in practice on
component skills, 1 hr. of observation in the simulated
control center, and 1 hr. of operational practice as
a controller.

Fig. 1. Changes in problem difficulty at various
stages of practice for the two experimental groups.
(Number of aircraft under control plotted against
trials for Group I, Constant Load, and Group II,
Graduated Load.)

For component skill training, stimulus materials
were recorded on a strip film for group presentation
and conditions of both written and verbal response
were employed. The exercises in aircraft performance
characteristics were in the form of written problems
in which the trainees were given the values of certain
variables and were asked to estimate the nature of
the required command.

The observation sessions were set up so that the
trainees could observe the performance of an
experienced controller during a typical exercise. The
operational practice was carried out under low input
load conditions with maximum feedback of knowledge
of performance. The results of the operational practice
trials were utilized for matching purposes in the
experiment proper.



Statistical Design and Procedure

The nature of the input load variable employed
in this study required that independent sets of
controller trainees be used in each experimental
condition. In order to maximize statistical power, a
matched-pair technique was used. That is, on the
basis of the 1-hr. practice trial, the novice controllers
were ranked according to initial proficiency. They
were paired on the basis of initial proficiency and
one of each pair assigned randomly to one of the
two experimental groups, the other S going to the
remaining group.

Experimental sessions were scheduled in
sequence. Order effects were minimized by a
balanced schedule. Each session consisted of five
problems or exercises. Each exercise was 30 min. in
duration. Each controller-trainee participated in 10
exercises during the course of two successive sessions
separated by a 24-hr. interval.
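
The matched-pair assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary of proficiency scores, and the use of adjacent ranks to form pairs are choices made here, since the report does not give the exact pairing rule beyond ranking on initial proficiency.

```python
import random


def matched_pair_assignment(scores, rng=None):
    """Rank trainees by proficiency, pair adjacent ranks, split each pair at random.

    scores: dict mapping a trainee label to an initial proficiency score
            (hypothetical values; higher means more proficient).
    Returns two lists of trainee labels, one per experimental group.
    """
    rng = rng or random.Random()
    ranked = sorted(scores, key=scores.get, reverse=True)
    if len(ranked) % 2:
        raise ValueError("an even number of trainees is required")
    group_1, group_2 = [], []
    for i in range(0, len(ranked), 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)            # random assignment within the matched pair
        group_1.append(pair[0])
        group_2.append(pair[1])
    return group_1, group_2


if __name__ == "__main__":
    demo_scores = {f"S{i}": i for i in range(1, 17)}   # 16 hypothetical trainees
    print(matched_pair_assignment(demo_scores, random.Random(0)))
```

Because each pair is split between the groups, the two groups start with nearly equal proficiency, which is what allows the later comparison to use tests for matched groups.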


Measures of Performance 


In order to achieve a detailed coverage of system
performance, multiple criteria were employed. The
most reliable of these measures has been flight
delay, which was calculated by determining the
minimum theoretical flight time for each aircraft
processed, subtracting this figure from the observed
flight time, and dividing by minimum flight time.
These scores were then averaged for all aircraft in a
problem, giving a mean delay score for each problem.
Excess fuel consumption was a second continuous
measure. It was computed separately for each aircraft
on the basis of hypothetical fuel consumption
curves which took into account three factors: aircraft
type, airspeed, and altitude.

Two noncontinuous measures were employed.
Failures to achieve proper landing set-ups were
tallied as one such measure and the number of
separation errors per problem was the other. A
separation error was defined as the failure to maintain
a 30-sec. flight time minimum distance between
aircraft. This standard translated to as much as
several miles or 6,000-8,000 ft. vertical separation
during early stages of the landing approach.
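
The flight delay criterion defined above reduces to a short calculation. The sketch below follows the verbal definition (observed minus minimum flight time, divided by minimum flight time, averaged over the aircraft in a problem); the function names and the sample times are hypothetical and serve only to illustrate the arithmetic.

```python
def delay_score(observed_time, minimum_time):
    """Excess flight time as a proportion of the minimum theoretical flight time."""
    if minimum_time <= 0:
        raise ValueError("minimum flight time must be positive")
    return (observed_time - minimum_time) / minimum_time


def mean_delay(flights):
    """Average delay score over all aircraft processed in one problem.

    flights: list of (observed_time, minimum_time) pairs in the same time unit.
    """
    if not flights:
        raise ValueError("at least one aircraft is required")
    scores = [delay_score(observed, minimum) for observed, minimum in flights]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical flight times in minutes for two aircraft.
    print(mean_delay([(30.0, 20.0), (25.0, 20.0)]))
```

Multiplying the result by 100 gives the percentage delay plotted in Figure 2.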


RESULTS 


The first consideration is the over-all learning
shown by the novice controllers during
the training sessions. The main continuous
performance measure, flight delay, is presented
in graphic form in Figure 2. The
progressive reduction in excess flight time for
Experimental Group I indicates the effect of
training under constant load conditions. The
curve for Experimental Group II is broken
into three sections to indicate the break
between levels of input load.

Fig. 2. Change in mean percentage delay as a
function of practice. (Group I, Constant Load, and
Group II, Graduated Load, plotted over trials.)


A statistical comparison between experimental
conditions was made on the basis of
performance on Trial 10 which is regarded as
the test trial. Since the participants were
matched on the basis of a preliminary test
trial, the t test for matched groups and the
Walsh test were appropriate. The results of
these tests for all criteria employed are presented
in Table 1. The superiority of Group I
over Group II is supported at the .01 probability
level on the criterion of excess flight
time. The probability level for excess fuel
consumption is .025. The differences between
the two groups on landing set-up errors and
on separation errors per aircraft processed are
not significant. However, total aircraft processed
per 30-min. trial is significant at the
.055 level. On no measure of performance did
there occur a reversal of the major trend.


TABLE 1

SUMMARY OF STATISTICAL TESTS OF DIFFERENCES BETWEEN
EXPERIMENTAL GROUP I AND EXPERIMENTAL GROUP II

                                          Mean Score         Statistical Test
Criterion Measure                       Group I  Group II   Test        Result     p
Mean excess flight time (a)             .35      .57        Matched t   t = 3.4    <.01
Mean excess fuel consumption (b)        [illegible] .56     Matched t   t = 2.25   <.025
Mean landing set-up errors
  per aircraft                          .038     .086       Walsh test  d < 0      n.s.
Mean separation errors per aircraft     .056     .086       Walsh test  d < 0      n.s.
Mean number of aircraft processed
  per 30 min.                           20       17.5       Walsh test  d < 0      .055

Note. Based on eight pairs of Ss.
(a) Excess flight time is the average ratio of excess flight time over minimum theoretical flight time.
(b) Excess fuel consumption is the average ratio of excess fuel consumption over minimum theoretical fuel consumption.


Fig. 3. Cumulative average processing rate (number
of aircraft landed). (Group I, Constant Load, and
Group II, Graduated Load, plotted over trials.)
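
The matched-groups t statistic used for the continuous criteria can be sketched as below. This is an illustration under stated assumptions, not a re-analysis: the individual scores of the eight pairs are not reported, so the numbers in the usage example are invented, and the function name `matched_t` is introduced here for convenience.

```python
import math
from statistics import mean, stdev


def matched_t(scores_a, scores_b):
    """t statistic for matched pairs: mean difference over its standard error.

    scores_a, scores_b: equal-length sequences, one score per matched pair.
    Returns (t, degrees_of_freedom).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length samples with at least two pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    if standard_error == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / standard_error, len(diffs) - 1


if __name__ == "__main__":
    # Invented scores for five matched pairs, for illustration only.
    t, df = matched_t([3, 4, 5, 6, 7], [2, 2, 2, 2, 2])
    print(f"t = {t:.3f}, df = {df}")
```

With eight matched pairs, as in this study, the statistic is referred to the t distribution with 7 degrees of freedom.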

A somewhat different approach to the 
analysis of these results leads to a considera- 
tion of the mechanics involved in the differ- 
ence noted between the two groups. A 
potentially important consideration is the 
cumulative number of aircraft processed over 
the total 10 trials. Figure 3 compares the 
two groups on this index. Group I, in this 
case, starts at a higher level and maintains 
a greater rate of increase throughout the 
series. Figure 4 compares the two groups on
cumulative gross error frequency. 


Fig. 4. Cumulative relative gross error frequency
(landing set-up errors plus separation errors).
(Group I, Constant Load, and Group II, Graduated
Load, plotted over trials.)


DISCUSSION 


The agreement between the results of the
present study and others (Barch, 1953; Barch
& Lewis, 1954; Green, 1955; Szafran & Welford,
1950) which have employed a similar
variable is substantial. This agreement is
made more significant by the fact that, while
most previous studies employed tasks which
depended upon the participants' motor skills,
the present task was largely perceptual-cognitive
in nature. This latter characteristic
is of progressively increasing importance as
industrial tasks increase in complexity (Gagné
& Bolles, 1958).

In spite of the agreement in the data, however,
there is not at present a well-established
explanation for the common outcome. Mechanisms
such as expectancy, frustration, and
motivation have been suggested, but none
has gained ascendency. In the present study
it is not possible to make a clear differentiation
between the actions of the mechanisms
that have been proposed so far. However, it
is possible to emphasize certain aspects of the
results which may provide some illumination.

Such an analysis requires a consideration
of the fundamental effects of reward and
punishment on the learning process. Learning
is generally thought to take place under
conditions of both reward (for correct responses)
and punishment (for incorrect
responses). It may be assumed for the present
that in the task utilized here, each aircraft
landed constituted a reward and each separation
error and landing set-up error was a
punishment (unpleasant event). The present
study provides an accurate portrayal of many
nonlaboratory learning situations due to the
fact that both kinds of events, reward and
punishment, occurred throughout training.


If Experimental Group I is compared with
Experimental Group II as is done in Figures
3 and 4, it is apparent that there were differential
frequencies of both kinds of events for
the two groups. Experimental Group I experienced
both more rewards and more punishments
than did Experimental Group II. Now,
if it is postulated that punishment is disruptive
during learning and leads to disincentive
to continue by frustrating the trainee,
the prediction would be that the learning of
Group I would be retarded. However, Group I
turned out to be substantially superior to
Group II.

It might be concluded, therefore, that it
is not the kind of reinforcement but the total
level of reinforcement which is the operative
factor. Thus, the level of feedback of knowledge
of results is higher for Experimental
Group I on both counts, knowledge of successes
and knowledge of errors. Group I
simply experienced a heightened rate of information
feedback and this eventuated in
superior performance.

While the above speculations are compatible
with the data, there are others that would
be equally so. It is possible, for example, that
participants in Experimental Group II learned
a number of responses under low input load
conditions that were inappropriate to the
higher input load conditions. Thus, at each
change in input load, new responses would
have to be learned under some degree of
interference from the responses already acquired.
It is likewise plausible to suggest that
Experimental Group I participants were indoctrinated
to a higher level of effort and
motivation by the implicit suggestion that it
was possible to operate efficiently under
higher input load conditions. Thus, rather
than being motivationally detrimental, the higher
load may have been quite compatible
with the achievement aspirations of the
participants.

Theoretical mechanisms aside, the practical
implications of the findings, while limited by
the limitations of the study, appear to be
relatively straightforward. Thus, given a relatively
complex task wherein most of the component
skills have been previously acquired, a
relatively high level of activity at the outset
of the system-training phase of the total
training program seems to be desirable when
the anticipated high initial error frequency
will not involve disastrous consequences.

SUMMARY

Improvement in performance with training
in the complex task of radar air traffic control
was compared under a condition of constant
high input load during training vs. a condition
of graduated input load during training.
Relative input load was defined as the number
of aircraft under the control of a single
operator.

The test performance of Ss trained under
constant high input load was significantly
superior on several criteria to that of Ss
trained under the graduated input load
condition.

An explanation was proposed in terms of 
the heightened frequency of feedback of 
knowledge of performance experienced by the 
high constant input load group. 


REFERENCES 


Barch, A. M. The effect of difficulty of task on proactive
facilitation and interference. J. exp. Psychol.,
1953, 46, 37-42.

Barch, A. M., & Lewis, D. The effect of difficulty
and amount of practice on transfer. J. exp. Psychol.,
1954, 48, 13?-141.

Chapman, R. L., Kennedy, J. L., Newell, A., &
Biel, W. C. The System Research Laboratory's
air defense experiments. Mgmt. Sci., 1959, 5, 250-269.

Gagné, R. M., & Bolles, R. C. A review of factors
in learning efficiency. USAF OSR tech. Note, 1958,
No. 58-924.

Goodwin, W. R. The System Development Corporation
and system training. Amer. Psychologist, 1957,
12, 524-528.

Green, R. F. Transfer of skill in a following tracking
task as a function of difficulty. J. Psychol.,
1955, 39, 355-370.

Hixson, W. C., Harter, G. A., Warren, C. E., &
Cowan, J. D. An electronic radar target simulator
for air traffic control studies. USAF WADC tech.
Rep., 1954, No. 54-569.

Kidd, J. S., & Kinkade, R. G. Air traffic control
system effectiveness as a function of the division
of responsibility between pilots and ground controllers:
A study in human engineering aspects of
radar air traffic control. USAF WADC tech. Rep.,
1958, No. 58-113.

Muckler, F. A., Nygaard, J. E., O'Kelly, L. I., &
Williams, A. C., Jr. Psychological variables in the
design of flight simulators for training. USAF
WADC tech. Rep., 1959, No. 56-369.

Szafran, J., & Welford, A. T. On the relation between
transfer and difficulty of initial task. Quart.
J. exp. Psychol., 1950, 2, 88-99.


(Received July 5, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 3, 170-174 


DRIVER OPINIONS AND REPORTED PERFORMANCE
UNDER VARIOUS INTERCHANGE MARKING AND
NIGHTTIME VISIBILITY CONDITIONS¹


MARVIN D. DUNNETTE 


University of Minnesota 


Nighttime driving conditions offer special 
problems of visibility. This is especially true 
at highway intersections. As a driver proceeds 
over any highway system, he continually ar- 
rives at a series of intersection choice points. 
Most drivers know where they want to go, 
but they do not always know exactly how to 
get there. It is, therefore, of obvious impor- 
tance to develop and utilize systems which 
will enhance nighttime visibility and thereby 
provide drivers with optimal information 
about the route or routes they may be fol- 
lowing. 

These considerations point up the impor- 
tance of providing adequate markings and 
conditions of visibility at highway intersec- 
tions. Highway systems throughout the coun- 
try have made wide and effective use of 
illumination and reflectorization to accom- 
plish these aims. A good deal of research 
utilizing direct physical measurement has 
been performed in an effort to assess the 
degree of visibility improvement under a 
variety of conditions of illumination.

In addition to widespread research on 
levels of visibility and their relative effective- 
ness, attention has been given to the relative 
utility of different marking systems in direct- 
ing or guiding driver performance. As men- 
tioned previously, appropriate guidance of 
drivers is particularly important at intersec- 
tions; the marking system should be sufficient
to reduce any potential confusion or error on 
the part of the driver. 

The study reported here was designed to
discover possible effects of different highway
nighttime visibility conditions and different
highway marking systems on driver performance.
Research was undertaken over a period
of 7 weeks during the summer of 1955. The
experiments were conducted in the state of
Minnesota on a cloverleaf interchange formed
by the intersection of U. S. Highway 61 and
Minnesota State Highway 36. A series of
experimental conditions of varying visibility
and using varying systems of highway markings
was utilized and driver performance
studied. All experimental studies were conducted
during night driving conditions between
the hours of 9:30-11:30 P.M.

¹ This article is a revision of papers which were
read before meetings of the Highway Research
Board in Washington, D. C., on January 13, 1960,
and the Midwestern Psychological Association in St.
Louis, Missouri on April 30, 1960.


METHOD 


Five conditions were employed during the period
of the study:

Condition I, the Fully Illuminated condition,
duplicated conditions typical for the interchange. The
mercury vapor luminaires were turned on as usual
and no special treatment of reflectorization was
employed.

Condition II, the Dark condition, consisted simply
of turning off the luminaires, still using no
reflectorization other than the signs employed to [...].

Condition III, the Standard Delineation condition:
the lights remained off, but reflective treatment was
employed in the form of amber delineators in the loops
and legs of the cloverleaf similar to standards
contained in the Manual on Uniform Traffic Control
Devices for Streets and Highways of the State of
Minnesota.

Condition IV utilized an experimental method of
reflectorization.² The luminaires remained off and
blue and amber delineators and blue and yellow
reflective pavement paints were used to indicate areas
of exiting and merging traffic. The entire
interchange was not treated; only the portions which
were visible to a motorist traveling on U. S. 61 or
traversing the ramps in the southeast quadrant
received reflective treatment.

Condition V combined the treatments of full
illumination and Experimental Reflectorization. The
luminaires were turned on. The reflective treatment
was maintained as in Condition IV.

² Figures showing the placement of delineators and
the nature of the pavement treatment employed in
Conditions IV and V may be obtained from the
author upon request.
A detailed technical description of the experimental
reflectorization which was employed is given in
Fitzpatrick, Joseph T., Integrated reflective pavement
and delineation treatments for night guidance,
unpublished paper, January 1960. A copy
of this paper may be obtained from M. D. Dunnette
upon request.



The purpose of this study was to study driver
performance under these various different conditions of
visibility and highway marking systems. Performance
was studied by interviewing motorists after they had
traversed the intersection. Interviewing stations were
located at two points, A and B. Motorists interviewed
at Station A were those who had just left
U. S. 61 and were about to enter and proceed in an
easterly direction on Minnesota 36. Motorists
interviewed at Station B were those who had proceeded
straight through the interchange from south to north
on U. S. 61 and also those who had just
entered U. S. 61 from Minnesota 36 via the cloverleaf
loop in the southeast quadrant.

Prior publicity via press, radio, and TV referred to
the fact that a study was to be conducted using various
experimental conditions. None of the publicity
described details of the conditions nor was any
information supplied which could be helpful to local
drivers in interpreting the meaning of the various
experimental treatments.


As motorists approached Points A and B, they
were signaled to stop and were asked to answer a
series of questions requiring about 5 minutes. The
interview schedule was designed to obtain information
in five major areas:

1. Personal information such as sex, age, familiarity
with the interchange, etc.

2. The extent to which the driver did or did not
experience difficulty in choosing the correct turn or
route through the interchange

3. Suggestions, if any, that the driver might offer
for improving the guidance system used in the
interchange

4. The cues used by the driver or found helpful
in recognizing certain critical response zones
such as areas of exiting and merging traffic

5. Individual impressions or opinions voiced by
drivers concerning the reflectorized treatment used
in Conditions IV and V



The Sample

A total of 1137 motorists was interviewed at
Stations A and B. The numbers interviewed in each
condition ranged between 199 for Condition V and
270 for Condition I.

A large majority of the motorists interviewed were
men, comprising 970 of the drivers; only 167 were
women. Somewhat fewer than half the drivers were
in the age range 26-40 with the remainder being
distributed equally between the under-25 and over-41
groups.

A large majority of drivers participating in the
study were familiar with the cloverleaf interchange.
Over half said they used the interchange daily. An
additional 30% reported using the interchange at
least once a week or oftener. Fewer than one in six
reported being totally unfamiliar with the interchange.
Examination of the frequencies of use of the
intersection by drivers under the different experimental
conditions showed no differences. At both
interviewing stations, chi square tests showed no
significant relation between familiarity with the
interchange and experimental condition.
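
The sample sizes quoted above can be checked against the cell counts in Table 1 of the Results. The sketch below is a consistency check only; it assumes the counts as they can be read from that table (difficulty and no-difficulty frequencies at each station), and the variable and function names are introduced here for illustration.

```python
# (had difficulty, had no difficulty) per condition, as read from Table 1.
STATION_A = {"I": (2, 84), "II": (10, 62), "III": (8, 67), "IV": (0, 67), "V": (2, 50)}
STATION_B = {"I": (1, 183), "II": (3, 185), "III": (2, 129), "IV": (5, 130), "V": (1, 146)}


def drivers_per_condition(station_a, station_b):
    """Total number of drivers interviewed under each condition, both stations."""
    return {cond: sum(station_a[cond]) + sum(station_b[cond]) for cond in station_a}


if __name__ == "__main__":
    totals = drivers_per_condition(STATION_A, STATION_B)
    print(totals)                       # per-condition sample sizes
    print(sum(totals.values()))         # overall sample size
    print(970 + 167)                    # men plus women, as stated in the text
```

On this reading the per-condition totals run from 199 (Condition V) to 270 (Condition I) and sum to 1137, which agrees with the count of 970 men and 167 women.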


RESULTS 


Only a small minority of respondents said 
they experienced any difficulty making their 
way through the intersection. The numbers 
and percentages of persons saying they had 
some difficulty are shown in Table 1. It may 
be noted that at Station A, the highest inci- 
dence of driver difficulty occurred under Con- 
ditions II and III. Under these two condi- 
tions, nearly one in eight drivers experienced 
difficulty locating the exit ramp to Minnesota 
36. Under the Fully Illuminated and experi- 
mentally reflectorized conditions (Conditions 
I and IV), practically no one (fewer than 1 in 
50) experienced difficulty. 

This finding is important for two reasons: 
(a) lighting is confirmed to be an effective 
way of reducing driver confusion and possible 
error, and (b) the Experimental Reflectoriza- 
tion is shown also to be an effective means of 
reducing driver difficulty in traversing the 
interchange. 

At Station B, only 12 drivers (about 1%) 
experienced any difficulty traversing the in- 
terchange. This is an expected result since it 
is easier to drive straight through an inter- 
section than to locate a particular point or 
turn off. 

Table 2 shows the numbers of motorists 
who volunteered suggestions for improving 
the marking or visibility of the intersection 
in some way. At both stations, fewest sugges- 
tions for improvement occurred under the 
two conditions employing Experimental Re- 
flectorization. 

It is noteworthy that a substantial decrease
in suggestions for improvement occurred between
Condition II and Condition I and that
a somewhat larger and significant decrease
occurred between Condition II and Condition
IV. Apparently the reflectorized treatment is
effective in offering both adequate visibility
and guidance.


TABLE 1

NUMBERS AND PERCENTAGES OF DRIVERS WHO REPORTED EXPERIENCING SOME
DIFFICULTY TRAVERSING THE INTERCHANGE

                                        Station A                  Station B
                                   Had          Had No         Had          Had No
                                   Difficulty   Difficulty     Difficulty   Difficulty
Condition                          N     %      N     %        N     %      N     %
I. Fully Illuminated               2     2      84    98       1     1      183   99
II. Dark                           10    14     62    86       3     2      185   98
III. Standard Delineation          8     11     67    89       2     2      129   98
IV. Experimental Reflectorization  0     0      67    100      5     4      130   96
V. Combined Illumination
   and Reflectorization            2     4      50    96       1     1      146   99

Note. For Station A, χ² = 15.4; p < .01. For Station B, the χ² test is inappropriate
because of extremely low cell frequencies in the "Had Difficulty" column.


A study of the actual suggestions made by
those motorists who offered them gives further
meaning to these results. Under Condition
I, the major suggestion was that more
signs be placed at the intersection; a few
motorists also suggested the use of markings
such as arrows on the pavement, markers
along the side of the road, and more vivid
center stripes. Under Condition II, the major
complaint apparently was caused by the
darkness. Although some motorists still mentioned
the need for more signs and markings,
most simply said "Turn on the lights." Under
Condition III, suggestions for improvement
included most of the factors mentioned under
the first two conditions. Suggestions under
Condition IV were fewer in number (as
shown in the tables), and seemed to be
more specific than those offered under the
first three conditions. Fewer suggestions were
offered under the combined conditions of illumination
and reflectorization than under any
other condition. This is evidence that a large
majority of drivers believed both visibility
and guidance to be adequate.


TABLE 2

NUMBERS AND PERCENTAGES OF DRIVERS WHO OFFERED SUGGESTIONS FOR IMPROVING
INTERCHANGE VISIBILITY AND/OR MARKINGS

                                        Station A                  Station B (a)
                                   Offered      Offered No     Offered      Offered No
                                   Suggestions  Suggestions    Suggestions  Suggestions
Condition                          N     %      N     %        N     %      N     %
I. Fully Illuminated               26    30     60    70       16    15     89    85
II. Dark                           34    47     38    53       20    20     80    80
III. Standard Delineation          27    36     48    64       16    20     66    80
IV. Experimental Reflectorization  13    19     54    81       9     13     63    87
V. Combined Illumination
   and Reflectorization            7     13     45    87       10    10     94    90

Note. For Station A, χ² = 21.6; p < .01. For Station B, χ² = 5.9; p > .05.
(a) At Station B, this question was asked only of through-traffic drivers on Highway 61.


TABLE 3

NUMBERS AND PERCENTAGES OF THROUGH-TRAFFIC DRIVERS SAYING THEY COULD
OR COULD NOT IDENTIFY AREAS OF MERGING TRAFFIC (a)

                                   Could Identify     Could Not Identify
                                   Merging Areas      Merging Areas
Condition                          N       %          N              %
I. Fully Illuminated               92      90         10             10
II. Dark                           102     94         7              6
III. Standard Delineation          66      [illegible] [illegible]   [illegible]
IV. Experimental Reflectorization  69      96         3              4
V. Combined Illumination
   and Reflectorization            101     97         3              3

Note. χ² = 16.0; p < .01.
(a) Seven through-traffic drivers failed to answer this question.

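
The chi square value reported for Station A in Table 2 can be reproduced from the cell frequencies. The sketch below computes the ordinary Pearson chi square for a contingency table in plain Python; the function name is introduced here, and the assumption is that no continuity correction was applied, which is consistent with the reported value.

```python
def chi_square(table):
    """Pearson chi square for a contingency table given as rows of counts.

    Returns (statistic, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Table 2, Station A: (offered suggestions, offered no suggestions), Conditions I-V.
STATION_A_SUGGESTIONS = [(26, 60), (34, 38), (27, 48), (13, 54), (7, 45)]

if __name__ == "__main__":
    stat, df = chi_square(STATION_A_SUGGESTIONS)
    # The .01 critical value of chi square with 4 degrees of freedom is about 13.28.
    print(f"chi square = {stat:.1f}, df = {df}")
```

The statistic comes to about 21.6 on 4 degrees of freedom, in agreement with the table note, and exceeds the .01 critical value.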
Regardless of the experimental condition,
a large majority of drivers exiting onto
Minnesota 36 from U. S. 61 believed the
route markings gave them adequate information
about where to turn. The percentage of
drivers saying this ranged from a low of 90%
under Conditions II and III to a high of 97%
for Condition IV. Most drivers, apparently
because of their familiarity with the interchange,
already knew where to turn. In addition,
however, it appears that the sign indicating
the approaching turn was a primary
source of guidance for drivers encountering
the first three conditions. Under the Experimental
Reflectorization, however, the sign
seemed less important, and the delineator and
pavement treatments were mentioned more
often. This could be due partly to the "newness"
of the experimental treatment. It is
possible that the pavement colors stood out
so sharply as to attract driver attention
to a greater degree than might have
been the case, had the drivers been more
familiar with the Experimental Reflectorization.

Motorists who were driving through the
interchange from south to north on U. S. 61
were nearly unanimous in their belief that the
through route was sufficiently well marked.
Another significant need for through motorists,
however, is to be clearly aware of areas
of exiting and merging traffic. These are
critical response areas for the motorist and it
is in and near these areas that improved visibility
and guidance may be most important.
Data shown in Tables 3 and 4 give information
about identification of these areas under
the various experimental conditions.

It may be noted that areas of merging and 
exiting traffic were recognized by a high ma- 
jority of drivers. The highest degree of recog- 
nition occurred under Conditions IV and V. 

The interview schedules also requested in- 
formation about the methods used by drivers 
in recognizing areas of merging and exiting 
traffic. Over half the drivers under the last 
two conditions mentioned the colors on the 
pavement and on the delineators as important 
sources of information. Few drivers (just over 
1%) under the reflectorized conditions men- 
tioned traffic flow as giving them evidence 
about merging and exiting areas; under the 
first three conditions, about 10% identified 
traffic flow as their major source of informa- 
tion. It is evident, therefore, that many driv- 
ers (over half) did associate the Experimental 
Reflectorization treatment with the identifica- 
tion of areas of merging and exiting traffic. It 
is difficult, however, to judge whether or not 
this is of practical importance. Even under 
the Dark condition, 94% of drivers successfully
identified areas of merging traffic; one
may well question, therefore, whether the increase
in successful identification to 97% for
Condition V is of any practical consequence;
further research is needed on this question.


TABLE 4

NUMBERS AND PERCENTAGES OF THROUGH-TRAFFIC DRIVERS SAYING THEY COULD
OR COULD NOT IDENTIFY AREAS OF EXITING TRAFFIC

                                   Could Identify     Could Not Identify
                                   Exiting Areas      Exiting Areas
Condition                          N       %          N       %
IV. Experimental Reflectorization  71      99         1       1
V. Combined Illumination
   and Reflectorization            100     96         4       4

Note. This question was not asked of drivers under the first three
experimental conditions. Hence no meaningful comparison may be made.

As explained previously, the interview 
schedules were designed, in part, to elicit
opinions and impressions from motorists con- 
cerning the Experimental Reflectorization em- 
ployed in Conditions IV and V. 

A large majority of motorists recognized 
intended relationships among the various 
markings. For example, over a third of the 
motorists noted that the blue of the exit 
ramp matched the blue of the sign indicating 
the location of the exit. It also was common 
for motorists to associate the amber or yellow 
colors of the pavement and delineator treatment
with SLOW or CAUTION. There was
nearly unanimous agreement that the reflec- 
torized treatment was helpful in driving. 
Many motorists volunteered comments indi- 
cating a generally favorable attitude toward 
this particular experimental treatment. 


DISCUSSION


This study was undertaken to examine
driver performance and opinions under
different conditions of night visibility and 
with the use of various highway marking 
systems. Motorists taking part in the study 
were, as a group, highly familiar with the 
interchange chosen for study, and were in a
position to offer informed opinions concern- 
ing the effects of the several experimental 
conditions employed. 

Since differences in driver opinions and re- 
ported performance were obtained under the 
various conditions, it is likely that drivers are 
aware of and concerned about different night
driving conditions. Opinions obtained from 
drivers in this study suggest that they are 
more confident, have less difficulty, and have 
a better opportunity to do a good job of night 
driving when visibility and guidance are im- 
proved either by illumination, reflectorization, 




or both. More drivers experienced difficulty in
traversing the interchange and more
made suggestions for improvements under
Dark and Standard Delineation conditions
than under the other three experimental conditions.

The results of the study also provide evidence
concerning the possible effects on driving
performance of the Experimental Reflectorization
employed in Conditions IV and V.
It appears that the reflectorization treatment
is readily related by the motorist to his
night driving needs. For example,


1. A significantly smaller number of motorists
made suggestions for improvements under
Condition V (the combined condition of full
illumination and Experimental Reflectorization)
than under any of the other four conditions.
The proportions of motorists making
suggestions increased progressively for Conditions
of Experimental Reflectorization, Fully
Illuminated, Standard Delineation, and Dark.

2. Conditions of Fully Illuminated and Experimental
Reflectorization appeared equally
effective in reducing the incidence of driver
difficulty in traversing the intersection.

3. Over half the drivers under Conditions
IV and V identified the pavement reflectorization
as indicating areas of merging and/or
exiting traffic.

4. It was the opinion of the large majority
of drivers under Conditions IV and V that the
Experimental Reflectorization was an effective
and helpful means of providing night
driving guidance.


The over-all results of this study suggest
that reflectorization as well as illumination
can be regarded as an effective means of reducing
driving problems related to nighttime
visibility conditions.


(Received July 8, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 3, 175-178


RESPONSE SET AND THE PREDICTION OF CLERICAL 
JOB PERFORMANCE 


PHILIP H. KRIEDT

Prudential Insurance Company

AND ROBERT I. DAWSON

Equitable Life Assurance Society


Considerable attention has been given recently
to the question of response set in personality
measurement. It has been shown that
much of the variance in a number of common
self-description questionnaires is interpretable
in terms of response set rather than item content
(Barnes, 1956; Edwards, 1957; Hanley,
1956; Messick & Jackson, 1958). Research
has been directed mainly at the identification
of various kinds of set or bias, social desirability
and acquiescence for instance, and at
techniques for controlling set so that content
scales can be more meaningful. Although several
writers including Cronbach (1950) and
[illegible] (1957) have noted that in some instances
response set scores might themselves
provide valid measures, this point has generally
not been emphasized.

In the construction of personality tests
concern about response set, if there has been any
concern at all, has usually resulted in attempts
to control it. This has been done by the use of
correction or suppressor keys such as the K
scale of the MMPI, by the avoidance of items
that are related to response sets as in the
"subtle" keys developed for the MMPI
(Wiener, 1951), by use of forced-choice item
forms, and by special administrative techniques
such as the "side by side" method recommended
by Voas (1958). Jackson and Messick
(1958), however, have pointed out the
need for deliberate attempts to increase response
set as a way for finding valid predictors
of behavior, and Berg (1957) has
argued provocatively that response set is the
important factor to measure in personality
assessment and that item content is of little
importance. He has used a test of meaningless
abstract designs to measure "deviant" response
set and has found that it differentiates
normal and deviant behavioral groups (1955).
Support for Berg's viewpoint is presented
in this article which reports a study in which
the Gordon Personal Inventory was given to
a group of clerical workers. The study provides
the rather ironic finding that the inventory,
a forced-choice test, was successful
in predicting job performance ratings not by
controlling response set but rather because it
permitted response set to affect the scores.


A DESCRIPTION OF THE SCALE 


In constructing the Personal Inventory 
Gordon (1953) wanted to use a forced-choice 
item format, but he was also concerned with 
making the test as acceptable as possible to 
test takers. He therefore selected the tetrad 
forced-choice form in which two equally fav- 
orable phrases are paired with two equally 
unfavorable phrases. The respondent is then 
asked to check one phrase as most like him 
and another as least like him. One is not com- 
pletely forced, consequently, to prefer one of 
the two items of equal preference value and 
as Berkshire and Highland (1953) have 
pointed out in their review of forced-choice 
rating procedures, this type of item is apt to 
permit considerable response bias. 

The Personal Inventory yields four meas- 
ures which are called: Cautiousness, Original 
Thinking, Personal Relations (trust and con- 
fidence in others), and Vigor. Each trait is 
represented once in each tetrad. Each trait is 
described approximately 10 times by complimentary
phrases and 10 times by uncomplimentary
phrases. If the respondent marks
a complimentary phrase as most like him or 
an uncomplimentary phrase as least like him, 
he gets a +1 on that scale. If he marks a 


complimentary phrase as least like him or an 
uncomplimentary phrase as most like him, he 
gets a —1 on that scale. For each tetrad, he 
may score plus on two traits, minus on two 
traits, or a plus on one scale and a minus on 
one scale. The highest possible score on any 
scale is +20 and the lowest possible score
-20. A fifth score, called the Total score, is
obtained by adding the four trait scores algebraically.
Total score may vary from +40 to
-40. Total score, it is important to note, is
strictly a response set score measuring the 
respondent's willingness or unwillingness to 
check unfavorable responses. If he checks 
only socially desirable phrases as most like 
him and socially undesirable items as least 
like him his total score is +40. If he checks 
only socially undesirable phrases as most like 
him and socially desirable items as least like 
him his total score is —40. Also, each of the 
four trait scores depends partly on whether or 
not the respondent is willing to say unfavor- 
able things about himself. If he always says 
favorable things, the trait score must fall between
0 and +20. If he checks some unfavorable
replies, a trait score may be negative.
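
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a list of answered tetrads, each recording which phrase was checked "most" and "least" and whether that phrase is complimentary) and the function name `score_inventory` are hypothetical, and the sample answers are invented rather than taken from the inventory.

```python
TRAITS = ("Cautiousness", "Original Thinking", "Personal Relations", "Vigor")


def score_inventory(tetrads):
    """Score answered tetrads with the +1 / -1 rule described in the text.

    Each tetrad is a dict with keys "most" and "least"; each value is a
    (trait, is_complimentary) pair for the phrase that was checked.
    (Hypothetical layout, chosen only for this sketch.)
    """
    scores = dict.fromkeys(TRAITS, 0)
    for tetrad in tetrads:
        # Complimentary phrase marked "most like": +1; uncomplimentary: -1.
        trait, complimentary = tetrad["most"]
        scores[trait] += 1 if complimentary else -1
        # Complimentary phrase marked "least like": -1; uncomplimentary: +1.
        trait, complimentary = tetrad["least"]
        scores[trait] += -1 if complimentary else 1
    # The Total score is the algebraic sum of the four trait scores.
    scores["Total"] = sum(scores[t] for t in TRAITS)
    return scores


# Invented respondent who gives only socially desirable answers on 20 tetrads.
all_desirable = (
    [{"most": ("Vigor", True), "least": ("Cautiousness", False)}] * 10
    + [{"most": ("Original Thinking", True), "least": ("Personal Relations", False)}] * 10
)
print(score_inventory(all_desirable))
```

With these invented answers every trait score is +10 and the Total is +40, the ceiling described in the text; replacing any answer with an unfavorable self-description lowers the Total by 2.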


RESULTS WITH OFFICE EMPLOYEES 


As a part of a tryout of several personality 
tests, the Gordon Personal Inventory was 
completed by 41 employees on beginning level 
jobs in an insurance company. The group was 
composed primarily of women with from 1 to 
$ years of service who were told that they 
were taking the test for experimental pur- 
poses and that it would in no way affect their 
job status. The immediate superiors of these 
employees ranked them on several job per- 
formance factors, and an overall performance 
evaluation for each employee was derived 
from these rankings. 

First of all, it is of interest to note the
extent to which clerks will check uncomplimentary
self-description phrases in a business
setting. In this instance, they did so over one-fifth
of the time. Of the 1,640 responses
checked by these 41 women, 360 or 22% were
instances of a complimentary phrase checked
as least like her or an uncomplimentary
phrase checked as most like her. No one
endorsed all the socially desirable responses.
The highest total score was +38, and the
lowest -18. The social desirability of all
items for this group was the same as for
Gordon's experimental group, that is, there
were no instances in which the majority of the
group preferred an uncomplimentary phrase.

Since these clerks frequently endorsed both
complimentary and uncomplimentary phrases,
each of the four trait keys is related to the
Total or social desirability score. The intercorrelations
among the four trait scores and
the Total score and the correlations of each
scale with the rating of job performance made
by the supervisors are shown in Table 1.


TABLE 1

CORRELATIONS AMONG FIVE SCORES OF GORDON PERSONAL INVENTORY
AND RATINGS OF CLERICAL JOB PERFORMANCE
(N = 41)

                                                                   Partial r
                                                                   with Rating,
                     Original  Personal                            Total Score
                     Thinking  Relations  Vigor   Total   Rating   Held Constant
Cautiousness           .32       .51       .10     .68     .38        ...
Original Thinking                .35       .52     .83     .31        ...
Personal Relations                        -.02     .62     .22        ...
Vigor                                              .63     .37        ...
Total score                                                .47

Note. For p = .05, r = .30; for p = .01, r = .39.


All four trait scores are highly related to
the Total or social desirability score. Although
all four trait scores are moderately
related to the criterion, these validities may
be due to the social desirability factor in the
trait score rather than the forced-choice trait
measurement. The highest validity is for the
Total or social desirability score. When Total
score is partialed out of the trait score validities,
it can be noted in Table 1 that the
validities shrink considerably and instead of
all four being positive, two are positive and
two negative.

It appears that the success of the inventory


in predicting clerical performance in this
situation is due mainly to the social desirability
response set which affects all the scores.

These results suggest that one should not
interpret validities obtained with trait scores
of the Gordon Personal Inventory as necessarily
being trait validities. Trait score validities
could be due to response set. Gordon gives
no evidence of validity in his manual for the
inventory but for the profile, he reports data
on two validity studies. One study is with 22
deputy sheriffs and all four trait scores correlate
positively (correlations of .21, .50, .25,
and [illegible]) with buddy ratings. No validity for
total score is given, but since all four trait
validities are positive, it is clear that the total
score must have moderately high validity. It
is certain that the validities for the trait
scores are due partly and perhaps mainly to
the social desirability response set. Another
study is reported comparing ratings of 30
department store salespeople and profile
scores. Salespeople with superior ratings have
significantly higher scores than those with
low ratings on all four trait scores. Again, it
is likely that all of these differences are due
in large part to the social desirability response
set since total score shows a similar difference
between the groups. The results with insurance
clerks reported here may not be unusual.
Perhaps much of the predictive strength of
the Gordon tests lies, paradoxically, in their
ability to measure a social desirability response
set.
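
The partialing step discussed above can be illustrated with the standard first-order partial correlation formula. The sketch below is a recomputation, not the authors' own calculation: it assumes the trait-with-Total correlations (.68, .83, .62, .63), the trait-with-Rating correlations (.38, .31, .22, .37), and the Total-with-Rating correlation (.47) as they appear in Table 1, and the function name `partial_r` is introduced only for illustration.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# trait: (r with Total, r with Rating), as read from Table 1 (assumed values)
TABLE_1 = {
    "Cautiousness": (0.68, 0.38),
    "Original Thinking": (0.83, 0.31),
    "Personal Relations": (0.62, 0.22),
    "Vigor": (0.63, 0.37),
}
R_TOTAL_RATING = 0.47

partials = {
    trait: partial_r(r_rating, r_total, R_TOTAL_RATING)
    for trait, (r_total, r_rating) in TABLE_1.items()
}
for trait, value in partials.items():
    print(f"{trait:20s} {value:+.2f}")
```

Under these assumed inputs the recomputed partials come out at roughly .09, -.16, -.10, and .11: two positive and two negative, all far below the original validities, which agrees with the pattern the text describes.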
Another way of holding the social desirability
factor constant in determining the
validity of trait scores is to score the trait
keys in a different manner. Trait scores which
are virtually independent of the social desirability
response set can be obtained in the
following way. Divide the tetrads into pairs
of favorable and unfavorable phrases. If one
phrase in a pair of favorable phrases is
marked "most like," assign +1 for that trait
and -1 for the other trait regardless of
whether it has been marked "least like" or
not marked at all. Similarly, if one phrase in
a pair of favorable phrases is marked "least
like," assign -1 for that trait and +1 for the
other trait regardless of whether it is marked
"most like" or not marked at all. Pairs of
unfavorable phrases would be scored in the
same manner but in the opposite direction. If
both phrases in a couplet are unchecked they
would be scored zero. The algebraic total of
trait scores obtained in this way is necessarily
zero and a response set score cannot be obtained
from such trait scores.
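
A minimal sketch of this revised scoring follows, written to show why the trait scores must sum to zero. The tetrad layout and the name `rescore` are hypothetical, and one point is an assumption not settled by the text: when both the "most" and the "least" mark fall in the same pair they imply the same direction, so the pair is scored once rather than twice.

```python
TRAITS = ("Cautiousness", "Original Thinking", "Personal Relations", "Vigor")


def rescore(tetrads):
    """Pairwise (zero-sum) rescoring of answered tetrads.

    Each tetrad is a dict (hypothetical layout) with
      "favorable":   (trait_a, trait_b)  traits of the two favorable phrases
      "unfavorable": (trait_c, trait_d)  traits of the two unfavorable phrases
      "most":  (kind, trait)   phrase checked as most like the respondent
      "least": (kind, trait)   phrase checked as least like the respondent
    where kind is "favorable" or "unfavorable".
    """
    scores = dict.fromkeys(TRAITS, 0)
    for tetrad in tetrads:
        # Unfavorable pairs are scored in the opposite direction (flip = -1).
        for kind, flip in (("favorable", 1), ("unfavorable", -1)):
            first, second = tetrad[kind]
            for mark, sign in (("most", 1), ("least", -1)):
                marked_kind, marked_trait = tetrad[mark]
                if marked_kind != kind:
                    continue
                other = second if marked_trait == first else first
                scores[marked_trait] += sign * flip
                scores[other] -= sign * flip
                break  # assumption: each pair is scored at most once
            # A pair with no checked phrase contributes zero.
    return scores


example = [{
    "favorable": ("Cautiousness", "Original Thinking"),
    "unfavorable": ("Personal Relations", "Vigor"),
    "most": ("favorable", "Cautiousness"),
    "least": ("unfavorable", "Vigor"),
}]
result = rescore(example)
print(result, sum(result.values()))
```

Every scored pair adds +1 to one trait and -1 to the other, so the four trait scores always sum to zero whatever the respondent checks, which is why no response set score can be recovered from them.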

The writers rescored the 41 inventory tests 
in this manner and obtained correlations for 
the four trait scores vs. job performance rat- 
ings that were almost identical to the trait 
score validities with Total partialed out re- 
ported in Table 1. Validity coefficients ob- 
tained for the revised trait scores were: 
Cautiousness .08, Original Thinking —.04, 
Personal Relations —.09, Vigor .04. 


SUMMARY 


A comparison of Gordon Personal Inven- 
tory scores and job performance ratings for a 
group of insurance clerks showed that three 
of the four trait scores and also the Total 
score, a social desirability response set score, 
had significant positive correlations with the 
job performance criterion. When Total score 
was partialed out of the trait score validities, 
however, these validities disappeared. Also it 
was found that when trait scores which are
free of response set were obtained by a revised
scoring procedure, the four trait scores
had zero validities. Results for two studies
reported by Gordon in his manual for the
profile, a forced-choice test similar to the
inventory, suggest that response set may account
for the validities reported.

This study furnishes an illustration of the 
possible value of response set as a predictor 
measure. Response set is not necessarily a 
nuisance factor in personality measurement 
and therefore something which should be 
eliminated. Measurement of response sets 
may prove to be a valuable approach to per- 
sonality assessment. 


REFERENCES 


Barnes, E. H. Response bias and the MMPI. J. consult.
Psychol., 1956, 20, 371-374.

Berg, I. A. Response bias and personality: The
deviation hypothesis. J. Psychol., 1955, 40, 61-72.

Berg, I. A. Research notes from here and there.
J. counsel. Psychol., 1957, 4, 154-160.

Berkshire, J. R., & Highland, R. W. Forced-choice
performance rating: A methodological study. Personnel
Psychol., 1953, 6, 355-378.

Cronbach, L. J. Further evidence on response sets
and test design. Educ. psychol. Measmt., 1950, 10,
3-31.

Edwards, A. L. The social desirability variable in
personality assessment and research. New York:
Dryden, 1957.

Gordon, L. V. Manual for Gordon Personal Profile.
Yonkers, N. Y.: World Book, 1953.

Hanley, C. Social desirability and responses to items
from three MMPI scales: D, Sc, and K. J. appl.
Psychol., 1956, 40, 324-328.

Jackson, D. N., & Messick, S. J. Content and style
in personality assessment. Psychol. Bull., 1958, 55,
243-252.

Messick, S. J., & Jackson, D. N. The measurement
of authoritarian attitudes. Educ. psychol. Measmt.,
1958, 18, 241-253.

Voas, R. B. A procedure for reducing the effects of
slanting questionnaire responses toward social acceptability.
Educ. psychol. Measmt., 1958, 18,
337-346.

Wiener, D. R. Subtle and obvious keys for the
Minnesota Multiphasic Personality Inventory. J.
consult. Psychol., 1951, 15, 134-141.


(Received July 11, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 3, 179-185


DEVELOPMENT AND VALIDATION OF SYNTHETIC 
DEXTERITY TESTS BASED ON ELEMENTAL 
MOTION ANALYSIS¹


DONALD W. DREWES 


Industrial Psychology Center, North Carolina State College 


The difficulties involved in choosing appropriate
dexterity tests for use in personnel
selection have been exemplified in numerous
studies (Candee & Blum, 1937; Ghiselli &
Brown, 1955, pp. 218-235; Treat, 1929;
[illegible], 1932). Dexterity tests thought to be
related to job efficiency often were surpassed
by other dexterity tests which appeared to
bear little similarity to the jobs in question.
Dexterity tests which were found to bear
some relation to successful performance on
one job often failed to generalize to other
jobs which appeared to require essentially the
same abilities.

Factorial studies of manual dexterity have
shown that human motor performance cannot
be accounted for by a single ability factor
(Fleishman & Hempel, 1954, 1956; Harrell,
[illegible]; Seashore, 1951). Rather, the evidence
suggests the presence of narrow bands of
[illegible] group factors. This specificity of the
constituent factors of complex motor abilities
makes general prediction of job success difficult.
Motor abilities required for the successful
performance of jobs of apparently similar
nature vary, thereby necessitating the
validation of present dexterity tests on each
job by conventional methods. Since validity
cannot be adequately generalized from one
situation to another, the test developer is
forced to tailor-make a testing program for
each situation with little but an educated
guess as to what tests will meet the local requirements.

Most dexterity tests currently in use measure
single elementary motions or
limited combinations of these motions. Because
of the limited motion patterns utilized,
present dexterity tests often do not match the
patterns of motion that characterize many industrial
jobs. As Griffin (1957) points out,
dexterity tests of the pegboard type have an
inefficient layout of the testing task resulting
from pins, collars, and washers being placed
in recessed bins instead of arranged in an arc
corresponding to the layout of a benchworker's
task.

¹ This paper is based on a thesis submitted to the
faculty of Purdue University in partial fulfillment of
the requirements for the PhD degree. The research
was supported by funds granted by the Purdue Research
Foundation.

The purpose of this research is the develop- 
ment of a series of dexterity tests which in- 
corporate many of the motion patterns used 
in bench-assembly jobs. It is hypothesized 
that the predictive validity of a test which 
essentially duplicates or simulates the se- 
quence of motion elements used on a job is 
greater than the validity of a test that does 
not intentionally simulate the sequence of 
motion elements. Since the specificity of 
motor skills has tended to limit the predictive 
validity of conventional dexterity tests, the 
simulation of actual motion patterns used on 
the job may offer new possibilities for utiliz- 
ing psychomotor tests as job predictors. 


METHOD 


Test Development 


A system of work elements applicable to a wide 
variety of manipulative jobs was selected as a model. 
Because factors such as distance reached, shape and 
size of object moved, ease of handling, and type of 
grasp conceivably might require specific manual abil- 
ities, only systems which included these factors were 
considered. The system that best met this criterion 
was the Methods-Time Measurement (MTM) sys- 
tem of predetermined times (Maynard, Stegemerten 
& Schwab, 1948). The MTM system utilizes the following
seven motion elements: Reach, Move, Turn,
Grasp, Position, Disengage, and Release.

The test design thought to be best characterized in 
terms of the MTM system was a modification of the 
conventional pegboard. A pegboard design was 
chosen because the MTM motions of Reach, Grasp,
Position, and Release of object being assembled are
present in almost every assembly job and are essen- 
tially the same motion elements as those involved in 
placing pins in a board. 




In order to duplicate the MTM element Position, 
three variables were taken into account: class of fit, 
symmetry, and difficulty of handling of object. Class 
of fit according to MTM definition is determined by 
clearance between pin and hole and pin dimension. 
Symmetry between pin and target varies from infinite
assembly possibilities to a single assembly position.
Ease of handling is dependent upon size and
shape of object being assembled. 

The effects of symmetry were duplicated by de- 
signing boards with three shapes of holes—round, 
square, and pentagonal. The boards having round 
holes simulated operations in which a pin could be 
inserted in any of an infinite number of positions, 
boards having square holes represented operations in 
which a pin could be positioned only four ways, and 
boards having pentagonally shaped holes represented 
operations in which a pin could be positioned and 
inserted only one way. The latter positional restric- 
tion resulted from the fact that all the sides of the 
pentagon were not of equal length. 

Variation in the ease of handling was accomplished 
by designing pins of two lengths, one length corre- 
sponding to the length necessary to qualify an object 
as being easy to handle and a longer length sufficient 
to qualify an object as being difficult to handle according
to the MTM standards.

The MTM system has specific requirements con- 
cerning which MTM elements are appropriate, given 
a particular combination of pin dimension and clearance.
Since it was not feasible to duplicate all pin
sizes, four sizes were selected. Appropriately shaped 
pins were designed for each board, with the excep- 
tion of the smallest dimension of the pentagonally 
shaped pin and several intermediate dimensions of 
the square pins which were excluded. The exact 
tolerances made the cost of machining the smallest 
pentagonal pins prohibitive, and production difficulties
prevented the building of a complete set of
square boards.

The clearance factor was accounted for by design- 
ing two boards for certain combinations of pin shape 
and size. One board corresponded to loose tolerance 
and the other board corresponded to tight tolerance 
as specified by the MTM system. 

The simulation of bimanual operations was accomplished
by designing the boards as trays filled with
small blocks. In this manner, the blocks could be 
removed from the trays and handled separately. 

The experimental test model as designed consisted
of 14 boards and 18 sets of pins. The complete assemblage
of boards and pins was entitled the Purdue
Elemental Motions Tests (PEMT). A dimensional
description of the PEMT is presented in Table 1.

Each board of the PEMT is essentially a wooden
tray filled with 30 wooden blocks arranged in three
rows of 10 blocks each. The outer dimension of the
tray is 161 X 5§ X 18 inches, and the inner dimension
is 151 X 44 X } inches. Each wooden block is
13 1} X 1 inches with a hole } inch in the center.


TABLE 1

DIMENSIONAL DESCRIPTION OF THE PEMT

      Boards                       Pins                    Clearance
Classification                             Length
of Holes         Size     Dimension    Short    Long     Loose    Tight
Round
  R1             5/64       1/16        ...      2                 1/64
  R2            15/64       7/32        ...      2                 1/64
  R3            21/64       5/16        ...      2                 1/64
  R4             3/8        5/16        ...      2       1/16
  R5            41/64       5/8         ...      2                 1/64
  R6            11/16       5/8         ...      2       1/16
Square
  S1             5/64       1/16        ...      2                 1/64
  S5            41/64       5/8         ...      2                 1/64
Pentagonal
  P1            15/64       7/32        ...      2                 1/64
  P2            10/32       7/32        ...      2       3/32
  P3            21/64       5/16        ...      2                 1/64
  P4            13/32       5/16        ...      2       3/32
  P5            41/64       5/8         ...      2                 1/64
  P6            11/16       5/8         ...      2       1/16

Note. Dimensions are in inches.
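
The relation among the three dimensional columns of Table 1 can be checked with exact fractions. The sketch below is illustrative only and rests on an assumption about how the table is read: that the clearance entry for each board is the hole size minus the pin dimension. The dictionary name `BOARDS` and the function `clearance_mismatches` are hypothetical.

```python
from fractions import Fraction as F

# board: (hole size, pin dimension, clearance), all in inches,
# as read from Table 1 (assumed reading of the columns).
BOARDS = {
    "R1": (F(5, 64), F(1, 16), F(1, 64)),
    "R2": (F(15, 64), F(7, 32), F(1, 64)),
    "R3": (F(21, 64), F(5, 16), F(1, 64)),
    "R4": (F(3, 8), F(5, 16), F(1, 16)),
    "R5": (F(41, 64), F(5, 8), F(1, 64)),
    "R6": (F(11, 16), F(5, 8), F(1, 16)),
    "S1": (F(5, 64), F(1, 16), F(1, 64)),
    "S5": (F(41, 64), F(5, 8), F(1, 64)),
    "P1": (F(15, 64), F(7, 32), F(1, 64)),
    "P2": (F(10, 32), F(7, 32), F(3, 32)),
    "P3": (F(21, 64), F(5, 16), F(1, 64)),
    "P4": (F(13, 32), F(5, 16), F(3, 32)),
    "P5": (F(41, 64), F(5, 8), F(1, 64)),
    "P6": (F(11, 16), F(5, 8), F(1, 16)),
}


def clearance_mismatches(boards):
    """Return the boards whose clearance is not hole size minus pin dimension."""
    return [
        name
        for name, (hole, pin, clearance) in boards.items()
        if hole - pin != clearance
    ]


print(clearance_mismatches(BOARDS))  # an empty list means every row is consistent
```

On this reading all 14 boards are internally consistent, with a 1/64 inch clearance on the tight boards and 1/16 or 3/32 inch on the loose ones.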


TABLE 2

COMPOSITE CONTINGENCY TABLES
(N = 72)

                                 Most Appropriate PEMT      Least Appro-    Minnesota Rate
                               Variation 1    Variation 2   priate PEMT     of Manipulation
Job                Criterion
Classification     Groups      Low    High    Low    High   Low    High     Low    High
All Jobs           High         23     13      24     12     19     17       18     18
                   Low          13     23      12     24     17     19       18     18
                   χ²            4.530*         6.722**        .056            0.000
                   φ              .251           .306          .090             .000
Assembly Jobs      High         11      5      13      3     11      5        8      8
                   Low           5     11       3     13      5     11        8      8
                   χ²            3.125*        10.125**       3.125*           0.000
                   φ              .312           .563          .312             .000
Nonassembly Jobs   High         12      8      11      9      8     12       10     10
                   Low           8     12       9     11     12      8       10     10
                   χ²             .900           .100          .900            0.000
                   φ              .150           .050         -.150             .000

* Significant at .05 level, one-tailed test.
** Significant at .01 level, one-tailed test.


Each of the boards designated as P1 through P6 (see
Table 1) contains blocks which have square holes
with a triangular insert placed in one corner.


Reliability Analysis


Reliability estimates were obtained for eight testing
tasks involving various combinations of distance
reached, shape of pins, length of pins, and size of
pins. Four testing tasks were administered to each of
two groups (N = 39, 30) of male college students.
The time required for the testee to insert the first 15
and last 15 pins was recorded for each task, and
reliability coefficients were computed.


Validation Procedure


The validation sample consisted of 72 female shop
employees of a large communications equipment
manufacturing company. Nine jobs were represented
in the sample. Each job was chosen because a high degree
of dexterity appeared to be essential for successful
job performance. All of the jobs were bench jobs
involving manual rather than machine operations.
The criterion was an efficiency index which indicated
worker productivity in relation to standards
established by the company. When two or more
workers on the same job received identical efficiency
indices, the ties were broken by having the supervisor
rank the tied workers.

The sample group (N = 8) from each job consisted
of four workers having above-average efficiency
indices and four workers having below-average
efficiency indices. Whenever possible, the
workers chosen represented the highest and lowest
extremes of the performance continuum for each job.
However, on certain jobs the restricted range of efficiency
indices and the limited number of workers
available resulted in a rather narrow differential between
the indices of the workers chosen. Workers
who had been on the job less than 3 months were
excluded from the sample. The efficiency indices were
ranked for each job and dichotomized at the median.

Four testing tasks were presented to each of the 
72 subjects, and the time taken to complete each task 
was recorded. Three of the tasks were variations of 
the PEMT; the fourth was the Minnesota Rate of 
Manipulation Test (MRM), a conventional dexterity 
test. Because of the complexity of the jobs, two vari- 
ations of the PEMT were administered in order to 
simulate better the motion patterns characterizing 
each job. These two variations were referred to as 
the Most Appropriate PEMT, Variation A, and the 
Most Appropriate PEMT, Variation B. In order to 
test the hypothesis that validity was a function of 
the similarity between testing task and job, a third 
variation of the PEMT, whose motion patterns were 
purposely mismatched with those of the job, was 
used. This variation was referred to as the Least 
Appropriate PEMT. The fourth test, the turning test 
of the Minnesota Rate of Manipulation Test, was in- 
cluded because it was thought to be representative of 
a conventional dexterity test requiring motion patterns
of a different type than those required by the
PEMT.

The most and least appropriate variations of the
PEMT were determined for each job. The selection
of the testing tasks to be used for each job was de- 
pendent upon the motion elements used in the per- 
formance of each job. Particular attention was given 
to such factors as tolerances of objects requiring 
assembly, shape and size of objects handled, distances 
reached, and the type of grasp required to handle the 
object. Given this information, the variations of the 
PEMT which appeared to offer the best chance for 
simulation of the job motions were selected. Con- 
versely, the information concerning each job served 
as a basis for the selection of the particular variation 
of the PEMT thought least likely to simulate actual 
job motions. 

The information concerning the motion patterns of 
each job was secured by consultation with the indus- 
trial engineer in charge of setting the standards for 
each of the nine jobs. Since the engineer was familiar 
with the motion patterns of each job, he was able to 
suggest rather unique ways of simulating the actual 
job pattern. 


Statistical Analysis 


Median tests were used in the evaluation of the 
relationship between test scores and criterion. The 
high criterion and low criterion groups on each job 
were classified according to the number above or 
below the median test score for that job. The rela- 
tionship between each of the four tests and the 
criterion was evaluated by combining the individual 
contingency tables for each job into a single contingency
table and computing the phi coefficient. The
same procedure was followed for a subset of the nine
jobs comprising four jobs which were essentially assembly-type
operations.
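
The computation of a phi coefficient from a combined 2 x 2 contingency table can be sketched as follows. The text does not say how chi-square was computed; the use of Yates' continuity correction here is an assumption, adopted because it reproduces the values of 10.125 and .563 reported for Variation 2 on the assembly jobs. The function names are introduced only for this illustration, and the sketch returns the magnitude of phi without its sign.

```python
from math import sqrt


def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def phi_from_chi_square(chi_square, n):
    """Magnitude of the phi coefficient implied by a 2 x 2 chi-square."""
    return sqrt(chi_square / n)


# Assembly jobs, Variation 2 of the Most Appropriate PEMT:
# high-criterion group 13 low / 3 high, low-criterion group 3 low / 13 high.
chi2 = yates_chi_square(13, 3, 3, 13)
phi = phi_from_chi_square(chi2, 13 + 3 + 3 + 13)
print(chi2, phi)  # 10.125 and 0.5625
```

Because each job contributes four workers to each criterion group and the test scores are split at the job median, the marginal totals of the combined table are fixed, and only the cell counts vary from test to test.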

The subset of four jobs classified as assembly jobs 
generally conformed to the stereotyped conceptualization
of bench-assembly jobs. The jobs classified as
nonassembly, on the other hand, involved
operations which were not primarily of an assembly
nature. Several of these jobs, for example, involved
the manipulation and soldering of flexible wires.
Other nonassembly jobs required the inspection and
adjustment of small electrical units rather than the
assembly of small parts.

Since two variations of the PEMT were used to
simulate the motion patterns of each job, some
means of deriving a composite total score for the
two tests was needed. Although the problem is
analogous to multiple regression in parametric statistics,
no comparable method was known for nonparametric
methods. Therefore, a trial and error
method of determining appropriate weights was used.
The time scores on the Most Appropriate PEMT
variations were transformed into z scores. Several
weighting schemes were tried and phi coefficients
computed. The weighting scheme selected was the
one which resulted in the highest coefficient between
the weighted score and the criterion when computed
over the nine jobs.
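
The weighting step can be sketched as a weighted sum of z scores. Several details are assumptions, since the text does not give them: the times are standardized within whatever group is passed in, the population standard deviation is used, and the times shown are invented for illustration. The function names `z_scores` and `composite` are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of time scores (population standard deviation)."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def composite(times_a, times_b, weight_a=1.0, weight_b=2.0):
    """Weighted sum of the z scores of two test variations."""
    z_a = z_scores(times_a)
    z_b = z_scores(times_b)
    return [weight_a * a + weight_b * b for a, b in zip(z_a, z_b)]


# Invented completion times (seconds) for four workers on the two variations.
times_a = [61.0, 58.5, 70.2, 64.3]
times_b = [45.1, 49.8, 52.0, 47.6]

# Variation B given twice the weight of Variation A.
scores = composite(times_a, times_b, weight_a=1.0, weight_b=2.0)
print(scores)
```

The composite scores would then be split at the median for each job and entered into the same contingency-table analysis as the single tests, once for each weighting scheme under comparison.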


RESULTS 


The validation results are shown in Tables
2 and 3. As is readily seen in Table 2,


TABLE 3

COMPOSITE CONTINGENCY TABLES RESULTING FROM THREE WEIGHTING PROCEDURES
APPLIED TO VARIATIONS 1 AND 2 OF THE MOST APPROPRIATE PEMT

                                               All Jobs           Assembly Jobs
                                               Composite Score    Composite Score
                                  Criterion
Weighting Procedure               Groups       Low     High       Low     High
Variations 1 and 2 Given          High          21      15         11       5
Equal Weighting                   Low           15      21          5      11
                                  χ²             1.390              3.125*
                                  φ               .138               .312
Variation 2 Given Twice           High          24      12         13       3
the Weight of Variation 1         Low           12      24          3      13
                                  χ²             6.722**           10.125**
                                  φ               .306               .563
Variation 2 Given Three Times     High          23      13         13       3
the Weight of Variation 1         Low           13      23          3      13
                                  χ²             4.530*            10.125**
                                  φ               .251               .563

* Significant at .05 level, one-tailed test.
** Significant at .01 level, one-tailed test.


Variations A and B of the Most Appropriate
PEMT exhibited significant validities when
computed for all nine jobs. The Least Appropriate
PEMT and the MRM did not exhibit
a significant relationship between job proficiency
and test scores when computed over
all nine jobs.

When the subset of four assembly-type jobs
was considered separately, both the most and
the least appropriate tests of the PEMT were
found to have significant validity coefficients.
The MRM failed to show a significant relationship
between test scores and criterion
when only the assembly-type jobs were considered.
None of the four tests had significant
validity coefficients when computed over the
nonassembly jobs.

Table 3 shows the contingency tables and
the validity coefficients that resulted when
different weighting procedures were used to
combine Variations A and B of the Most
Appropriate PEMT. Three different weighting
procedures were compared: Variations A
and B given equal weight, Variation B given
twice the weight of Variation A, and Variation
B given three times the weight of Variation
A. The procedure whereby Variation B
was given twice the weight of Variation A was
chosen because it resulted in the highest
validity when considered over all the nine
jobs. No other weighting procedures were
tried because it was felt that no weighting
procedure could produce a composite score
which would have a validity coefficient larger
than the largest validity coefficient between
the individual variations of the PEMT and
the criterion.
The eight reliability estimates ranged from
[illegible] to .900 with a median of .860. Since the
reliability coefficients were of similar magnitude,
they were considered to represent a
close approximation to the reliability parameter.


DISCUSSION


It was hypothesized that a series of dexterity
tests which could be used to simulate a large
number of motion patterns characterizing
assembly-type jobs would result in
higher predictive validities than would a conventional
dexterity test involving a fixed
pattern of motion. This hypothesis is generally
supported by the results of the validation
study. The validity coefficients of the
MRM when computed over all jobs and when 
computed for assembly and nonassembly jobs 
separately were zero in every case. The va- 
lidities of the PEMT, on the other hand, 
appeared to generalize from one job to an- 
other. When evaluated over all nine jobs, 
both variations of the Most Appropriate 
PEMT expressed a significant relationship 
with the criterion. The fact that the relation- 
ship of the Least Appropriate PEMT over all 
nine jobs failed to be significant lent further 
support to the hypothesis. 

When the nine jobs were subdivided into 
four assembly and five nonassembly jobs, the 
Least Appropriate PEMT and both varia- 
tions of the Most Appropriate PEMT were 
found to have significant validities over the 
assembly jobs and nonsignificant validities 
over the nonassembly jobs. The magnitude of 
the validity coefficient for the Least Appropri- 
ate PEMT for the assembly jobs was larger 
than expected, since it was hypothesized that 
the Least Appropriate PEMT would have a 
lower relationship with the criterion than 
would the Most Appropriate PEMT. How- 
ever, this hypothesis is partially supported 
by the fact that the validity for Variation B 
of the Most Appropriate PEMT exceeded 
the validity of the Least Appropriate PEMT 
when computed only for assembly jobs. The 
significance of the difference between these 
validity coefficients could not be properly 
evaluated since no appropriate statistical 
technique was known. An approximation 
using the standard error of uncorrelated 
tetrachoric correlations computed from data 
dichotomized at the median indicated a sig- 
nificant difference between the validity co- 
efficients. The results, however, were accepted 
with reservation, since tetrachoric correlations 
based on small sample sizes are often un- 
stable. 

Variation B of the Most Appropriate 
PEMT appeared to be a more valid predictor 
of criterion performance than Variation A. 
The higher predictive validity of Variation B 
might possibly be interpreted as resulting 
from a transfer of training or practice effect 
as Variation A always preceded Variation B 
in the sequence of test administration. If such
an effect were operating, the practice received
on Variation A would transfer to Variation B, 
causing Variation B to represent a more valid 
measure of the true time score than Varia-
tion A. This possibility is expressed in the
derived weighting procedure whereby Varia- 
tion B is given twice the weight of Varia- 
tion A. 

A significant result of the validation study 
was the finding that the PEMT could not be 
used effectively to duplicate nonassembly 
operations. If the PEMT is to be effective 
as a predictive instrument, it must be re- 
stricted to assembly-type operations until the 
coverage can be extended to include the 
motion patterns found on jobs involving 
activities other than the assembly of rigid 
parts into fixed location receptacles. 


CONCLUSIONS 


The results of this research indicate that 
the predictive utility of psychomotor tests 
may be substantially increased by developing 
tests which will more closely approximate 
actual motion patterns used on the job. 
Instead of concentrating on the macro aspects 
of overall job performance, it may be profit- 
able to divide the job into micro units and 
to develop means of predicting performance 
on appropriate sequences of these micro 
units. In this manner, it may be possible to 
synthesize predictors for particular jobs based 
on the microanalysis of the motion patterns 
involved. New jobs may conceivably be 
analyzed and the appropriate selection instru- 
ments designed before the job is actually in 
existence on the shop floor. Selection of tests 
on the basis of microanalysis of motion pat- 
terns may substantially reduce the amount 
of guesswork involved in the selection of 
predictive tests by providing the practitioner 
with a formal procedure for test selection. 

The results of this research represent only 
a small start toward meeting the challenge of 
synthetically developing psychomotor tests. 
The conclusions from this study are based 
on the results obtained from one of a vast 
number of industrial enterprises in our so- 
ciety, and are limited at present to bench 
jobs of an assembly nature. Notwithstanding 
these limitations, the approach advocated in 
this research appears to offer new and chal- 


Donald W. Drewes 


lenging opportunities for increasing the utility 
of psychomotor tests as predictors of manual 
job performance.

In summary, the following conclusions appear
warranted on the basis of the results of
this study:


1. A dexterity test designed to simulate a
large number of motion patterns characterizing
assembly-type jobs will generally yield higher
predictive validities than will conventional
dexterity tests involving rather different
patterns of motion.

2. The selection of appropriate predictive
tests on the basis of job motion patterns gives
the test practitioner a formal procedure for
the selection of dexterity tests.

3. The validities of certain tests may be
generalized to similar job situations without
the benefit of conventional validation
procedures.

SUMMARY


A series of pegboard tests entitled the
Purdue Elemental Motions Tests (PEMT)
was designed to incorporate many of the
motion elements used in the Methods-Time
Measurement predetermined time system.
Because the PEMT was thought to offer
greater possibility for the simulation of
motion patterns of manipulative jobs than do
many conventional dexterity tests, it was
hypothesized that the predictive validity of the
PEMT would be generally higher than the
validity of conventional dexterity tests
involving motion elements which were not
tailor-made for the individual jobs. It was
further hypothesized that a subtest of the
PEMT selected to simulate the pattern of
motion elements would be a more valid
predictor of job success than would a subtest
of the PEMT purposely selected to
misrepresent the actual motion elements.

Validation on an industrial sample supported
the hypotheses. The validity of the
PEMT exceeded that of the conventional
dexterity test when generalized over all jobs
sampled and when generalized over a subset
of assembly jobs. The validity of the Most
Appropriate PEMT variations exceeded the
validity of the Least Appropriate PEMT
when generalized over all jobs. It was
concluded that if dexterity tests could be
validated on component sequences of motion
patterns, untapped areas of application may
be uncovered.




(Received July 15, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 3, 186-192


SITUATIONAL EFFECTS ON A PROJECTIVE TEST¹


BERNARD MAUSNER 
Graduate School of Public Health, University of Pittsburgh 


When a psychological test of any kind is 
administered, the degree to which the results 
can be attributed to temporary situational 
effects is always in question. Although this is 
especially true for questionnaires and intel- 
ligence tests, there is some evidence that the 
needs of the moment will affect a TAT-like 
instrument in a predictable way (McClelland, 
Atkinson, Clark, & Lowell, 1953). In fact, 
this effect is made the basis for a measure of 
needs. It can certainly be demonstrated that 
many kinds of tests can be faked by subjects 
following instructions to adopt one or another 
set (cf. references cited by Heron, 1956). 

Demonstrations on a quasi-projective test 
of a situational effect in which powerful 
motives are roused by a stress related to the 
subject’s real life are not as easy to find as 
is simulated faking. The sole example is 
Heron’s study (1956) in which one-half of a 
group of applicants for a position took a 
battery of tests before they were hired and 
the other half afterwards. Systematic differ- 
ences between the responses of the two groups 
were related to the difference in set produced 
by the assumption on the part of the one 
group that the test mattered and on the part 
of the other that it was unimportant. 


¹ Many individuals were responsible for the gathering
of the data reported on here. The study in which
the survey sample was gathered was carried on by
the Research Division of Psychological Service of
Pittsburgh under a grant from the Buhl Foundation
and with the assistance of a number of industrial
firms in Pittsburgh. The research director was
Frederick I. Herzberg. Scoring of the Rosenzweig
P-F tests was done by Eva Reinkraut and Edith
E. Fleming. The organizational sample was drawn
by Reinkraut. The writer wishes to acknowledge
gratefully the cooperation of Psychological Service
of Pittsburgh in making the data accessible. Statistical
computations were carried out by Joseph
Meiri and Elaine Sloan, both of the Graduate School
of Public Health, University of Pittsburgh. The
writer is fully responsible both for the statistical
analyses and the discussion which comprise this
report. The analyses were done under support from
Grant M-2836 from the National Institutes of
Health.


The present paper reports the comparison
of two sets of scores on the same test, each
gathered as part of an activity unrelated to
the other. This comparison, although it is not
entirely free of possible ambiguities, throws
some light on the way in which the results
of this test were affected by the context in
which it was taken.

The Rosenzweig Picture-Frustration Study
was administered to 203 engineers and
accountants as part of an investigation of
their attitudes towards their jobs (Herzberg,
Mausner, & Snyderman, 1959). The test was
used to examine some of the subtle factors
in personality which might affect an individual's
reaction to his working conditions.
Subjects were given the test booklet, a brief
orientation as to the procedure to be followed,
and an envelope addressed to the director of
the research. The test was completed at the
subject's leisure and then mailed back. While
there is no way of knowing where or when the
work was done, it is likely that most of the
men filled out the booklets at their desks
shortly after the orientation. Since the investigation
of attitudes did not include any measures
of contemporary feelings, but was restricted
to stories about past periods in which
the subject was happy or unhappy about his
work, the attitudinal data give no clue as to
the subject's frame of mind during the
administration of the test. One can assume that
the subject's set varied randomly since the
data were gathered in 11 different companies
over a 6-month period. In the end 154 usable
protocols were returned.

Some question arose concerning the propriety
of utilizing the results of so unorthodox
a procedure. A partial check on the validity of
the procedure was available since the files of
the sponsoring organization included
protocols from a number of men to whom
the P-F had been routinely given during the
course of an appraisal procedure. It was possible
to compare the average scores of the
sample of men in the survey with the norms


TABLE 1
P-F SCORES FOR SURVEY SAMPLE OF ENGINEERS AND
ACCOUNTANTS COMPARED TO PUBLISHED NORMS
FOR MALES
(ROSENZWEIG, 1949)


 Scoring        Published         Survey Sample
 Category         Norms             (N = 154)          t
    %           M      SD          M       SD        (df=)
 E              45    13.3        43.8    11.5     [illegible]
 I              28     8.3        29.0     6.7        1.35
 M              27     9.5        27.2     8.7     [illegible]
 O-D            20     7.8        20.6     7.3     [illegible]
 E-D            53    11.3        50.6    10.4        2.16*
 N-P            27    10.3        28.9     9.7        1.95


* Significant at .05 level.


derived from these routine administrations.
To the dismay of the research team, the
scores from the survey sample differed widely
on almost every scoring category from the
organization's norms. However, a comparison
of the study sample's scores with the published
norms available for the general population
(Rosenzweig, 1949) showed a significant
difference on only one of the six scores
(see Table 1).
Apparently, men in the survey sample reacted
to the test in much the same way as
men in the general population, but the same
kind of man brought into a consulting organization
for appraisal reacted differently.
To investigate the difference one
would need matched groups given the P-F
under neutral conditions and under appraisal.
Until such a study could be carried out, it
was decided to take a closer look at the already
available data. This was done by comparing
the survey population with a group
drawn for this purpose from the records
of the organization. These groups were
matched as carefully as the information available
permitted, within the limits set by the
number of cases in each population.

The first step was to withdraw from the
consulting organization's files the test protocols
of all men who fit the occupational
criteria for the survey. These men were engineers
and accountants holding jobs above the
lowest level of clerical or drafting work, but
below the level of company officers. Many of
them worked for the same companies as the 
men in the survey. These 60 protocols were 
rescored by the same team of psychologists 
which had scored the tests administered 
during the survey. 

A comment on the scoring scheme would be 
useful at this point. The P-F consists of 24 
cartoons, each depicting a potentially frus- 
trating situation. The protagonist in the 
cartoon is the frustrated figure. Coming from 
his mouth is an empty balloon. The subject is
instructed to write into the balloon “what 
the man is saying.” These comments are 
categorized twice. In the first scheme, they 
are identified as extrapunitive (E), intra- 
punitive (I), or impunitive (M). Thus, a 
response indicating aggressive lashing out at 
the source of frustration is scored E; a re- 
sponse in which the frustration results in 
aggression turned inwardly is scored I; a 
response which attempts to side-step the issue 
of frustration and pretend to its nonexistence, 
is scored M. Each response is also, in the sec- 
ond and parallel scheme, described as showing 
one of three qualities of reaction to frustra- 
tion. The first of these is object dominance 
(O-D), which reflects a concern with the 
circumstances of the frustration. The second 
is ego-defensiveness (E-D) in which the em- 
phasis is on the wound to the subject’s self- 
esteem. The third category is need-persistence 
(N-P) which focuses on the necessity for 
solving the problems raised by the frustrating
circumstances.

The Rosenzweig manual (Rosenzweig,
Fleming, & Clark, 1947) was
used as a basis for scoring. Each test was
scored independently by two psychologists.
One of these, Edith Fleming, was a member
of the original group which standardized the
test. After considerable discussion, the two
psychologists responsible for the scoring arrived
at a consensus for the small number of
items for which their independent scoring had
diverged.

In accordance with Rosenzweig's original
procedure, the scores are reported as percentages.
Only tests in which at least 22 of the
24 responses could be scored were included.
The percentages are given with the total number
of scorable responses, 22, 23, or 24
respectively, as a base.



TABLE 2
COMPARISON OF ORGANIZATIONAL AND SURVEY GROUPS'
P-F ON EACH OF SIX SCORING CATEGORIES


 Scoring       Organizational        Survey
 Category         (N = 60)          (N = 154)         t
    %            M       SD        M       SD       (df=)
 E             37.95   11.55     43.75   11.47     3.35**
 I             28.65    7.48     29.04    6.70     0.38
 M             33.32    8.52     27.16    8.68     4.74**
 O-D           20.08    6.64     20.56    7.24     0.48
 E-D           47.47   10.98     50.63   10.39     2.00*
 N-P           32.33   10.72     28.97    9.74     2.23*


* Significant at .05 level. 
** Significant at .01 level. 
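
The t values in Table 2 can be approximated from the tabled means, standard deviations, and sample sizes. The sketch below uses the usual pooled-variance formula for independent samples; the article does not say which formula was used, so this is an assumption, and the function name is illustrative.

```python
from math import sqrt


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t from means, SDs, and Ns, using a pooled variance."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se


# Table 2, extrapunitive (E) row: survey sample minus organizational sample.
t_e = pooled_t(43.75, 11.47, 154, 37.95, 11.55, 60)
# Table 2, impunitive (M) row: organizational sample minus survey sample.
t_m = pooled_t(33.32, 8.52, 60, 27.16, 8.68, 154)
```

Under this assumption the two rows come out near 3.3 and 4.7, close to the tabled 3.35 and 4.74; small discrepancies of this size would be expected from rounding or from a slightly different variance estimate.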


Table 2 shows the comparison of the scores
from the two samples matched only with
respect to general occupation. Significant differences
emerge in both areas in the scoring
scheme; the men who took the test as part
of an appraisal showed significantly less aggression
(E), and significantly more evasion
of the frustration (M); they showed significantly
less ego-defensiveness (E-D) and
significantly more need-persistence (N-P).

Unfortunately for the acceptability of these
differences, a look at the demographic characteristics
of the two samples (see Table 3)
shows that aside from the fact that they are
engineers and accountants doing professional
work, the two samples are not matched for
each of the three variables on which information
is available. The organizational sample
had fewer accountants, fewer noncollege men,
and fewer men past 35 than the sample in
the survey.


TABLE 3
CHARACTERISTICS OF THE TWO GROUPS OF ENGINEERS
AND ACCOUNTANTS WHOSE ROSENZWEIG P-F SCORES
ARE COMPARED


                    Survey       Organizational
 Subjects          (N = 154)        (N = 60)
 Engineers             83              43
 Accountants           71              17
 Noncollege            79               9
 College               75              51
 Age
   Under 35            43              42
   Over 36            111              18


Two further treatments of the data were
undertaken. In the first, each of the two
samples was analyzed separately to determine
the degree to which the scores were a function
of the demographic characteristics of the
sample. Second, successive pairs of samples
matched for each of the demographic variables
were drawn and the test scores compared.


TABLE 4
COMPARISON OF PAIRS OF SAMPLES FROM ORGANIZATIONAL
AND SURVEY GROUPS MATCHED ON ONE VARIABLE
(Engineers vs. Accountants)


                          Engineers                        Accountants
              Organizational   Survey      t     Organizational   Survey      t
% Scores         (N = 43)     (N = 83)   (df=)      (N = 17)     (N = 71)   (df=)
E      M           36.8         44.6     3.52**       40.8         42.8      0.65
       SD          11.8         11.4                  10.6         11.6
I      M           29.5         29.0     0.42         26.5         29.1      1.35
       SD           7.0          6.7                   8.4          6.8
M      M           33.6         26.5     4.19**       32.6         28.0      2.00*
       SD           9.2          8.8                   6.6          8.5
O-D    M           19.2         20.5     0.90         22.2         20.7      0.88
       SD           6.5      [illegible]               6.6          6.7
E-D    M           47.6         50.0     1.45         47.1         51.3      1.50
       SD          11.6         10.9                   9.7          9.8
N-P    M           33.1         29.6     1.80         30.5         28.3      0.83
       SD          10.8          9.9                  10.7          9.7


* Significant at the 5% level.
** Significant at the 1% level.


TABLE 5
COMPARISON OF PAIRS OF SAMPLES FROM ORGANIZATIONAL
AND SURVEY GROUPS MATCHED ON ONE VARIABLE
(COLLEGE VS. NONCOLLEGE)


                           College                         Noncollege
              Organizational   Survey      t     Organizational   Survey      t
% Scores         (N = 51)     (N = 75)   (df=)      (N = 9)      (N = 79)   (df=)
E      M           37.4         43.5     2.80**       41.0         44.0      0.76
       SD          11.8         11.7                  10.5         11.4
I      M           29.1         28.9     0.14         25.9         29.1      1.41
       SD           7.5      [illegible]               7.4          6.5
M      M           33.4         27.5     3.67**       33.0         26.8      2.2*
       SD           8.7          9.0                   7.6          8.4
O-D    M           19.8         20.1     0.21         21.6         21.0      0.22
       SD           6.8          6.9                   3.0          7.6
E-D    M           47.3         48.4     0.61         48.4         52.7      1.18
       SD          11.4          9.8                   8.0         10.5
N-P    M           32.8         31.6     0.68         29.8         26.5      0.97
       SD          11.3          9.0                   6.8          9.3


* Significant at the 5% level.
** Significant at the 1% level.


The first treatment had little general significance
since unknown factors may have
affected the distribution of subjects along
each of the continua for which information
was available. For example, the men over 35
in the organizational sample may have been
a deviant group within their age range since
they were either looking for a job or up for
promotion or reassignment; the younger men
in being considered for their first jobs might
more closely approximate a representative
sample of young engineers and accountants.
However, a brief summary of the results of
this analysis may be useful to investigators
who would be able to carry on systematic
studies of the effects of age, education, and
occupation on P-F scores.
One general finding may be noted; no differences
are found within the samples for the
scores based on direction of response (E,I,M). 
The scores based on quality (O-D, E-D, 
N-P) do not differ when all engineers are 
compared to all accountants. But quality does 
seem to be affected by age. The trend in both 
samples is for need-persistent or problem 
solving responses to fall off with age. This is 
significant in the survey sample, which includes
a good many men over 35 (F = 5.15,
df within = 151, between = 2, p < .01). N-P
responses are also significantly less common
among noncollege men in the survey sample
than among college men (t = 3.27, n = 152,
p < .01); there is also a significant increase
in this group in the frequency of ego-defensive
reactions (t = 2.64, n = 152, p < .01).
For the organizational sample, only the accountants
show a significant difference between
men over and those under 35. Here
the tendency is for the older men to emphasize
the frustrating situation in their responses,
i.e., to show higher O-D scores (t = 2.5,
n = 15, p < .05).

The results of the second treatment may
be found in Tables 4, 5, and 6. In this series,
pairs of subsamples matched on one demographic
variable were drawn from each
sample and the means of each of the six
scores compared by means of a t test. As one
might expect from the variation within the
groups described in the first treatment, the
quality of response to aggression is not significantly
different for any of these comparisons.
The differences in quality of response
(O-D, E-D, N-P) shown in Table 2
are, therefore, probably due to the poor
matching of the two major samples. The fact
that the survey sample shows fewer need-persistent
responses and more ego-defensive
responses than the group of men who took
the test in the organization's appraisal procedure
may well be due to the presence in the
former group of a sizeable body of men over 35 who
had not attended college. Why such men should
react to an imagined frustration with ego-defensive
and object-dominated reactions
rather than an attention to the solution of
the problem is obviously not indicated by the
data of this research.

The significant differences in direction of
response to frustration found in Table 4 and
the failure to find such differences in the
analysis of the internal characteristics of the
two groups supports the contention that the
situation in which the test was given was
responsible. No matter what groups are drawn
from the two major samples the administration
of the test during an appraisal results in
an increase in the frequency of responses
scored M. For three of the groups, engineers,
college men, and men over 35, this is balanced
by a significant decrease in the frequency of
extrapunitive responses.


TABLE 6
COMPARISON OF PAIRS OF SAMPLES FROM ORGANIZATIONAL
AND SURVEY GROUPS MATCHED ON ONE VARIABLE
(Age under 35 vs. Age over 36)


                         Age under 35                      Age over 36
              Organizational   Survey      t     Organizational   Survey       t
% Scores         (N = 42)     (N = 43)   (df=)      (N = 18)     (N = 111)   (df=)
E      M           37.6         41.9     1.52         38.7         44.5    [illegible]
       SD          11.5         14.0                  12.1         10.4
I      M           28.4         28.3     0.06         29.2         29.3      0.08
       SD          12.1          7.9                   9.5          6.2
M      M           33.9         29.8     2.00*        32.0         26.1      2.81**
       SD           9.0          9.6                   7.4          8.1
O-D    M           20.0         19.8     0.12         20.2         20.8      0.33
       SD           6.6          6.7                   6.8          7.4
E-D    M           46.6         48.6     0.88         49.4         51.4    [illegible]
       SD          11.1          9.4                  10.7         10.7
N-P    M           33.3         31.8     0.66         30.1         27.9      0.84
       SD          11.0          9.2                  10.1         10.2


* Significant at the 5% level.
** Significant at the 1% level.


DISCUSSION


There are two possible explanations for
this finding. One is that the men who took
the test during appraisal were, if not malingering,
at least trying to present as attractive a
picture as possible. This would lead them to
suppress aggressive reactions and replace them
with bland and meaningless evasions of the
frustrating circumstances portrayed in the
cartoons. It would hardly be surprising to
find that some men are able to follow Whyte's
prescriptions for the testee (Whyte, 1956).
This explanation implies that the men were
able to penetrate the significance of the test
and manipulate their responses in a systematic
way. While there is evidence that this
is possible for questionnaires (Heron, 1956;
Wesman, 1952) there has previously been no
evidence that a projective device like the
P-F is susceptible to such manipulation.

An alternate explanation is that the men
were affected by the tensions of the appraisal
in the same way that McClelland's subjects
were affected by his manipulation of needs for
achievement (McClelland, 1953). Certainly,
anyone under appraisal wants to control the
impression he makes; this could lead to the
evasive M response without the subject's trying
to manipulate the results of the test in any
systematic way simply through the general 
inhibition of behavior which betrays emotion. 
The fact that there is no increase in the 
socially valued problem solving (N-P) re- 
sponses, argues that the differences are due 
more to situational sets roused by the ap- 
praisal situation than to systematic faking. 

If the latter explanation is confirmed by 
further investigation, the P-F might be a 
useful instrument to measure situational sets. 
For example, the ratio of impunitive to 
extrapunitive responses could be used as a 
sensitive indicator of the degree to which a 
population felt under surveillance in a study 
in which it was important to obtain inde- 
pendent evidence of the salience of an ap- 
praisal of personality factors. Potentially, 
such uses of this test might turn out to be 
far more valid than its original application 
as a measure of ongoing personality characteristics.


SUMMARY AND CONCLUSIONS


The Rosenzweig Picture-Frustration Study 
was administered to 154 engineers and ac- 
countants. The test booklets of a comparable 
sample of 60 engineers and accountants who 
had taken the Rosenzweig as part of an 
appraisal procedure in a psychological consulting
organization were scored in an equivalent
manner to those of the survey sample.

Information concerning age and education 
was available for these subjects. When the 
variations in scores on the bases of these two 
variables were taken into account, it was 
found that there was a significant difference 
between survey and organizational samples 
only in one of the two scoring procedures. 

Individuals who took the Rosenzweig 
Picture-Frustration test as part of an assess- 
ment procedure showed a significantly higher 
tendency to give responses which were scored 
as impunitive than the group which took the 
test anonymously. That is, they tended to 
avoid the overt expression of hostility, and 
to substitute for it statements evading or 
denying the existence of frustration. 

Two explanations were suggested for this 
finding. One was that conscious faking on the 
part of people in the organizational sample
resulted in an attempt to present as favorable
as possible a view of themselves. The other
was that the dispositional sets roused by the
appraisal situation succeeded, without the
subject's awareness of the significance of
his responses, in depressing expressions of
emotion.

REFERENCES


Heron, A. The effects of real-life motivation on
questionnaire response. J. appl. Psychol., 1956, 40,
65-68.

Herzberg, F., Mausner, B., & Snyderman, Barbara
B. The motivation to work. New York: Wiley,
1959.

McClelland, D. C., Atkinson, J. W., Clark, R. A.,
& Lowell, E. L. The achievement motive. New
York: Appleton-Century-Crofts, 1953.

Rosenzweig, S. Revised norms for the adult form
of the Rosenzweig Picture-Frustration Study. St.
Louis: Author, 1949.

Rosenzweig, S., Fleming, Edith E., & Clark,
Helen J. Revised scoring manual for the Rosenzweig
Picture-Frustration Study. J. Psychol., 1947,
24, 165-208.

Wesman, A. G. Faking personality test scores in a
simulated employment situation. J. appl. Psychol.,
1952, 36, 112-113.

Whyte, W. H., Jr. The organization man. New
York: Simon & Schuster, 1956.


(Received July 26, 1960) 


Journal of Applied Psychology


THE LIMITING HAND SKIN TEMPERATURE FOR
UNAFFECTED MANUAL PERFORMANCE
IN THE COLD


R. ERNEST CLARK
Quartermaster Research and Engineering Command, Natick, Massachusetts


The data of several studies suggest that
manual performance is first affected by cold
exposure between 55°F and 65°F
hand skin temperature (e.g., Clark, 1959;
Clark & Cohen, 1960; Gaydos, 1958; Gaydos
& Dusek, 1958). However, this is only a
suggestion since these investigations were
designed to suit other experimental interests.
The purpose of the present study was to
establish the lower limit of hand skin temperature
(HST) for unaffected manual performance,
and to determine the stability of
this level of temperature when duration of
exposure is varied. On the basis of data reported
by Gaydos (1958), and Gaydos and
Dusek (1958), 60°F HST was studied as
the limiting skin temperature for
unaffected performance in the cold, and 55°F
HST as the skin temperature initially associated
with severe cold effect.


METHODS


[illegible] white enlisted men, dressed in shorts and
shoes, were exposed to a constant ambient temperature
of 70°F and a relative humidity of [illegible].
Hand cooling was accomplished by exposing the
S's hands to 10°F air within a refrigerated
box; the cooling box and knot-tying
task are described by Clark and
Cohen (1960).

The experimental period lasted 4 days. Before
cooling each day, S practiced tying five sets of 5
knots. His HST was then raised to 90°F with a
[illegible] and his hands inserted immediately into
the cooling box. Experimental performance times
were obtained on five different occasions during the
cooling process: (a) upon entrance of S's hands into
the cooling box; (b) when his HST had fallen to
the appropriate criterion temperature, 0
minutes exposure at criterion; (c) after 20
minutes exposure at the criterion HST; (d) after 40
minutes exposure; and finally, (e) after 60 minutes
exposure.

On Days 1 and 2 of the 4-day experimental session
the criterion HST for half of the Ss was 55°F and
for the other half 60°F. On Days 3 and 4 the
criterion HSTs at which performance was measured
were reversed to exclude practice bias from
[illegible].



Although performance was always measured at the
specific criterion temperature, HST was permitted
to vary ±4°F from the criterion during the 1-hour
exposure period. Thus, when S’s HST had fallen 4°F
below criterion, his hands were withdrawn from the
cooling unit and were exposed to the 70°F ambient
temperature. When his HST had risen 4°F above
criterion, his hands were reinserted into the cooling
box. It should be noted that the ranges of 55°
± 4°F and 60° ± 4°F actually overlap.
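
The withdraw-and-reinsert rule described above amounts to a simple two-sided (hysteresis) control around the criterion HST. The following is only an illustrative sketch under that reading; the function names and the `band_f` parameter are hypothetical and do not come from the original procedure.

```python
def next_hand_state(hst_f, criterion_f, in_box, band_f=4.0):
    """Return True if the hands should be in the cooling box after this reading.

    in_box is the current state; the state flips only at the band edges.
    """
    if in_box and hst_f <= criterion_f - band_f:
        return False  # fallen 4°F below criterion: withdraw to ambient air
    if not in_box and hst_f >= criterion_f + band_f:
        return True   # risen 4°F above criterion: reinsert into the box
    return in_box     # inside the band: leave the hands where they are


def band_overlap(criterion_a, criterion_b, band_f=4.0):
    """Return the (low, high) overlap of two criterion bands, or None."""
    low = max(criterion_a, criterion_b) - band_f
    high = min(criterion_a, criterion_b) + band_f
    return (low, high) if low <= high else None


if __name__ == "__main__":
    # Bands of 55° ± 4°F and 60° ± 4°F share the region from 56°F to 59°F.
    print(band_overlap(55.0, 60.0))
```

Under this sketch the two criterion bands overlap between 56°F and 59°F, which is the overlap the text notes.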


RESULTS 


All scores were adjusted for initial, pre- 
experimental, performance level by subtract- 
ing S’s scores obtained at 90°F (HST) from 
each of his succeeding scores on a given test 
day. These deviation scores are shown in 
Figure 1 as joint functions of HST and exposure
duration.
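
The adjustment just described is a subtraction of each S's 90°F (HST) score from his later scores on the same day. A minimal sketch follows; the function name and the sample times are hypothetical and are not data from the study.

```python
def deviation_scores(baseline_sec, later_secs):
    """Subtract the 90°F (HST) baseline time from each later time on the same day."""
    return [t - baseline_sec for t in later_secs]


if __name__ == "__main__":
    # Hypothetical knot-tying times (sec) for one S on one test day.
    baseline = 30.0                  # time at 90°F HST, before cooling
    later = [31.5, 34.0, 36.5, 37.0]  # times at criterion, 20, 40, and 60 minutes
    # Positive deviations are decrements (slower than baseline).
    print(deviation_scores(baseline, later))
```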

FIG. 1. Changes (Δ) in manual performance as
functions of hand skin temperature and duration
of cold exposure. (Positive changes are decrements.)
[Plot not reproduced. Ordinate: Δ performance time from that at
90°F HST (sec); abscissa: exposure time at criterion HST (min).]

Analyses of variance of the adjusted data
indicated significant (p < .001) main effects
of HST and duration of exposure, but no sig-
nificant interactions between the experimental
variables (p’s > .10). HSTs of 55°F were
consistently associated with performance 
decrements and these decrements increased 
over exposure duration, becoming asymptotic 
after about 40 minutes of exposure. In con- 
trast, performance at 60°F HST was never 
significantly different from that at 90°F HST, 
even though duration of exposure influenced 
performance somewhat at this skin tempera- 
ture level. 

To determine the stability of these findings 
for different groups of Ss, the total subject 
sample was divided into two groups of six 
Ss and the data of each group were analyzed 
for replication differences. None were found. 
Essentially the same effects of HST and 
duration of exposure occurred in both halves 
of the subject sample. 


Discussion 


The present data suggest quite unequivo- 
cally that the HST at 60°F is not associated 
with performance hindrance due to cold 
exposure when tasks similar to the present 
one are used (tasks requiring much joint 
movement). In addition, critical performance 
decrements may be expected when HST falls 
5°F below this level, i.e., to 55°F HST.
These findings remained unaltered by ex-
posure duration and were completely sup-
ported by two samples of Ss.

Presumably, some continuous function 
passing from no affect to severe exists be- 
tween the HSTs studied here, but the deter- 
mination of the function would be extremely 
difficult due to performance variability. Fur- 
thermore, a finer difference than 5°F between 
criterion HSTs would probably be unreason- 
able because of the need to use HST ranges 
to accomplish prolonged exposure periods. 

Considering the findings of Clark and 
Cohen (1960), it should be noted that the 
present data for performance at 55°F HST 
could have been achieved only with a 
“medium” rate of hand cooling, that is, the 
cooling rate normally associated with expos- 
ing bare hands to air temperatures around 
10°F. Very rapid hand cooling (exposure to, 
say, subzero air) could have permitted surface 
hand temperatures to drop to criterion levels 


R. Ernest Clark 


before internal hand temperatures had v 
sufficiently lowered to hinder performan 
Thus, the curve in Figure 1 for Bertone 
change at 55?F HST would have ps a 
the zero (no change) line, showing pertor i 
ance decrement only later in the hour. 
period. Very slow hand cooling dee 
20°F air or higher) could have negate! od 
apparent influence of the present Pr. 
exposure variable since internal han A 
peratures might have become asympto P 
fore performance was first tested at the pr 
HST criterion. In the latter case, the y 
HST curve in Figure 1 would have apparel 
as a straight line displaced above and ae a 
to the zero line, illustrating a constan B. 
formance decrement across the expo 
period. 

SUMMARY 


The hands of 12 enlisted men were cooled
to 55°F and 60°F surface temperature on
different experimental days. Performance
times to complete a standard knot-tying task
were obtained when S’s hands first reached
the appropriate hand skin temperature, after
20 minutes’ exposure at the criterion tempera-
ture, after 40 minutes’ exposure, and after 60
minutes’ exposure.

It was found that performance was
hindered when hand skin temperature fell to
55°F, and that performance decrements at
this skin temperature level were increasing
exponential functions of duration of exposure,
becoming asymptotic after about 40 minutes of
exposure. In contrast, performance at 60°F
hand skin temperature remained unaffected
throughout the exposure period.



REFERENCES 


Clark, R. E. The calculation of mean hand skin
temperature in studies of manual performance in
the cold. PB Rep., 1959, No. 29. (Environmental
Protection Research Division, Quartermaster Re-
search and Engineering Command, Natick, Massa-
chusetts)

Clark, R. E., & Cohen, A. Manual performance as
a function of rate of change in hand skin tem-
perature. J. appl. Physiol., 1960, 15, [pages illegible].

Gaydos, H. F. Effect on complex manual perform-
ance of cooling the body while maintaining the
hands at normal temperatures. J. appl. Physiol.,
1958, 12, 373-376.

Gaydos, H. F., & Dusek, E. R. Effects of localized
hand cooling versus total body cooling on manual
performance. J. appl. Physiol., 1958, 12, 377-380.


(Received July 28, 1960)



Journal of Applied Psychology
1961, Vol. 45, No. 3, 195-200


A COMPARISON OF ONE-, TWO-, AND THREE-MAN WORK UNITS
UNDER VARIOUS CONDITIONS OF WORK LOAD ¹


J. S. KIDD

Ohio State University


One of the more persistent problems of [illegible] is that of
dealing effectively with varying work loads when there is a fixed
equipment facility. Some degree of flexibility in [unit] capacity
has been thought to be possible if the operating crew could be
augmented under peak load conditions. Recent evidence from studies
of the performance of small groups or teams ² (Kidd, 1958; Kinkade
& Kidd, 1958; Moore & Anderson, 1954; Versace, 1956), however, has
cast some doubt on the efficiency of crew augmentation, per se, as
a device to increase system capacity. The [gist] of these reports
has been that if a [illegible] task is distributed among more than
one operator, the gain in performance, if any, is disappointingly
slight. That is, if a single person could handle a task adequately
under moderate input load conditions, adding one or two helpers
does far less than double or triple the load capacity. This result
seems most [pronounced] when decision making activity is
[involved], as opposed to more routine tasks.

The purpose of the present study was to determine the effect of
crew augmentation upon the performance of the particularly complex
task of radar air traffic control. An additional consideration was
the relationship between team performance and the personnel
composition of the team. The question here was the extent to which
performance could be predicted on the basis of some aspect of the
individually measured performance of the constituent members. It
has been said that a group is more than the simple sum of its
members (Warriner, 1956). Various propositions have been advanced
to give more precise meaning to this statement. For example,
Simmel (Wolff, 1950, pp. 26-36) speculates that groups tend to
perform at the level of the poorest member. Kidd (1958), in a
different context, has suggested that the group might act to
inhibit the activity of the poorer member and thus allow the
better member to contribute proportionately more to the ultimate
group output. Fiedler (1954) and others have suggested that the
perceived similarity of member characteristics is among the
determinants. Finally, there is the possibility that the simple
average of the individual members may provide the best prediction.
The present study attempted a partial evaluation of these
alternative propositions.

¹ This research was carried out in the Laboratory of Aviation
Psychology and was supported by the United States Air Force under
Contract No. [illegible]. Permission is granted for reproduction,
translation, publication, use, and disposal in whole or in part by
or for the United States Government.

² The term “unit” is employed in the title of the paper to avoid
the anomaly of having a one-man team [...] normally suitable to a
single-man operation.


METHOD 


Apparatus, Task Setting, and Subjects. The gen- 
eral task environment was provided by the simula- 
tion within the laboratory of a radar landing-ap- 
proach control center. The simulation was imple- 
mented by the specially developed OSU Electronic 
Air Traffic Control Simulator (Hixson, Harter, War- 
ren, & Cowan, 1954). This device, which is built 
around an analog computer, is capable of generating 
up to 30 dynamic aircraft targets and presenting 
them realistically to the radar controller via a cathode 
ray tube display. Direct manipulation of the “aircraft”
is accomplished by college students trained to
faithfully carry out pilot functions. In addition to
the visual display of aircraft position available to 
the controller, he is in direct auditory communica- 
tion with the "pilots" under his jurisdiction through 
simulated radio channels. 

The task of the controller or control team is the 
guidance of the simulated aircraft through the pre- 
liminary phases of a landing approach. This involves 
the pickup and acknowledgment of aircraft entering 
a specified zone of responsibility, guidance of their 
flight course over a 50-mile approach route, altitude
and airspeed adjustment prerequisite to actual landing,
and positioning of the aircraft for acceptance by
a subsequent control agency for the final phase of
the landing process. The controller is also responsible
for coordination of departure clearances.

In the present study, the zone of responsibility was
of constant extent. When a single operator was ac- 
tive, he was responsible for the total area. It was 
divided in equal segments when a two- or three-man 
team was employed. 

Nine laboratory trained controllers participated as 
subjects. All had approximately 6 months’ experi- 
ence in the control task at the time the study was 
initiated. 

Experimental Variables and Statistical Design. Two independent
variables were evaluated in this experiment: the size of the
control team or unit and the level of input load. Three different
unit sizes were compared: a single-man operation, a two-man
operation, and a three-man operation. Input rate per controller
was sampled at one aircraft arrival every 90 seconds (on the
average), one arrival every 60 seconds, and one aircraft arrival
every 30 seconds.

The two variables were combined factorially, but only six of the
nine possible combinations of conditions were actually tested.
Table 1 illustrates the design in graphic form. It was apparent
that a complete test of all possible combinations was neither
meaningful nor even feasible within the context of the present
experiment. Thus, conditions yielding redundant information were
dropped as were those wherein the over-all system input rate would
have exceeded the capacity of the system for sustained operation.
The six remaining cells or combinations of conditions provided an
opportunity to determine most effectively the effects and
coeffects of the two major variables. The design is particularly
advantageous in that it provided for a comparison of input rate
conditions in terms of both input to the controller and input to
the total system. Thus, across the center row, the input per
controller was constant (input load per controller is one aircraft
every 60 seconds) while the input to the total system varied
directly as a function of the number of controllers. Along the
intact diagonal of Table 1, however, input to the total system is
held constant while input per controller varies inversely with the
number of controllers in the system. Interaction effects were
subject to test by a comparison of four conditions: namely, 90
seconds/one controller, 60 seconds/one controller, 90
seconds/three controllers, and 60 seconds/three controllers.


TABLE 1
FACTORIAL DESIGN OF EXPERIMENT

Input Interval            Size of Control Unit
in Seconds         One-Man      Two-Men      Three-Men
    90             90/90 a        —            90/30
    60             60/60         60/30         60/20
    30             30/30          —             —

a Number to left indicates input load on [controller in terms] of
input interval in seconds; [number to right indicates input load
on total system].
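
The relation between the two numbers in each cell of Table 1 can be sketched as follows. This is only an illustration, assuming that the system interval is the per-controller interval divided by the number of controllers, as the tabled cells 60/30, 60/20, and 90/30 suggest; the names `TESTED_CELLS`, `system_interval`, and `cell_label` are hypothetical.

```python
from fractions import Fraction

# (per-controller interval in seconds, team size) for the six tested cells.
TESTED_CELLS = [(90, 1), (90, 3), (60, 1), (60, 2), (60, 3), (30, 1)]


def system_interval(controller_interval_sec, team_size):
    """Average seconds between arrivals for the whole system (assumed relation)."""
    return Fraction(controller_interval_sec, team_size)


def cell_label(controller_interval_sec, team_size):
    """Format a cell as 'controller interval/system interval', as in Table 1."""
    return f"{controller_interval_sec}/{system_interval(controller_interval_sec, team_size)}"


# Center row: load per controller held constant at one arrival every 60 seconds.
constant_controller = [c for c in TESTED_CELLS if c[0] == 60]

# Intact diagonal: load on the total system held constant at one arrival every 30 seconds.
constant_system = sorted(c for c in TESTED_CELLS if system_interval(*c) == 30)

if __name__ == "__main__":
    print([cell_label(*c) for c in constant_controller])
    print([cell_label(*c) for c in constant_system])
```

The two lists correspond to the comparisons later reported in Tables 4 and 3, respectively.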

Procedure. Each problem consisted of the arrival of 24 aircraft.
Three of these were at an intermediate stage of approach at the
time the problem was begun. Problem duration was approximately 30
minutes with a 10-minute interval between problems. Six problems
were included in a single session. [Number illegible] sessions
were completed each week.

Assignment of controllers to sessions and conditions was
accomplished in a cyclic program. The steps in the development of
the total program were as follows: (a) each experimental condition
appeared once in each session and the order of conditions in the
session was random; (b) the sessions were arranged in a series;
(c) one controller was assigned to each session by random
selection; and (d) the other controllers needed to fill out the
two- and three-man team conditions were drawn from adjacent
sessions so that, for example, the second member in the two-man
team was always the controller assigned to the next succeeding
session. This procedure was employed to minimize differential
practice effects and to insure maximum participant utilization.
The major advantage of the method of scheduling is that during the
total experiment each participating controller was active in each
experimental condition and at each control position. Each
condition was included once in each session.
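
The cyclic assignment in steps (b) through (d) can be sketched as below. Several points are assumptions rather than statements from the text: that there were nine sessions (54 problems at six per session), that the series wraps around from the last session to the first, and that the third member of a three-man team comes from the second succeeding session. The function names are hypothetical.

```python
def team_for_session(session, team_size, n_controllers=9):
    """Controllers (0-based) forming the team in a given session.

    Position 0 is the controller assigned to the session itself; further
    positions are filled from the next succeeding sessions, wrapping around.
    """
    return [(session + k) % n_controllers for k in range(team_size)]


def position_counts(team_size, n_controllers=9):
    """Count how often each controller serves at each control position."""
    counts = {}
    for session in range(n_controllers):
        team = team_for_session(session, team_size, n_controllers)
        for position, controller in enumerate(team):
            key = (controller, position)
            counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    print(team_for_session(8, 3))
    print(set(position_counts(3).values()))
```

Under these assumptions every controller serves exactly once at every position for each team size, which matches the stated advantage of the scheduling method.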

Measures of Performance. Two types of response measures were
utilized in this experiment to evaluate the effect of the
experimental variables:

1. The first major measurement category is system efficiency. It
included mean percent delay per aircraft, mean fuel consumption
per aircraft, and number of missed approaches. Mean delay per
aircraft was calculated by determining the minimum theoretical
flight time, subtracting it from the observed flight time, and
dividing by the minimum flight time. This delay score was then
averaged over all the aircraft in a problem, giving a mean percent
delay score for each problem. Estimated fuel consumption in pounds
was computed separately for each aircraft on the basis of
hypothetical curves which took into account three factors:
aircraft type, airspeed, and altitude.

2. The second major system performance category was safety.
Separation errors were tallied during the problem by a
monitor-observer stationed at the simulated control center. A
separation error was defined as the approach of one aircraft
within [illegible] seconds’ flight time of another aircraft. This
meant that as much as 6-mile lateral distance or [illegible] feet
of altitude separation during descent was required in the control
area.

In addition to safety in the airspace of the control zone, runway
surface safety was also introduced as a criterion of performance.
A runway separation error was scored if a 60-second safety
interval was not maintained subsequent to the departure clearance.
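
The delay score defined under the first measurement category can be written out as a short computation. This is a sketch only; the multiplication by 100 is assumed from the word "percent," and the function names and flight times are hypothetical.

```python
def percent_delay(observed_sec, minimum_sec):
    """Delay for one aircraft: (observed - minimum) / minimum, as a percentage."""
    return 100.0 * (observed_sec - minimum_sec) / minimum_sec


def mean_percent_delay(flights):
    """Average the per-aircraft delay scores over all aircraft in a problem.

    flights is a list of (observed_sec, minimum_sec) pairs.
    """
    return sum(percent_delay(obs, mn) for obs, mn in flights) / len(flights)


if __name__ == "__main__":
    # Hypothetical flight times (sec) for two aircraft in one problem.
    problem = [(900.0, 600.0), (1200.0, 600.0)]
    print(mean_percent_delay(problem))
```

A score of 100 on this measure means that observed flight time was twice the minimum theoretical flight time.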


RESULTS 


Data were recorded from a total of 54 problems. The over-all
performance of the relatively [novice], laboratory trained
controllers [yielded] an average of 45 aircraft movements [...].
Approximately 34 simulated landings and 11 departure clearances
made up this [total]. This was equivalent to one aircraft movement
every 80 seconds, on the average. The performance of the nine
controllers as individuals was relatively homogeneous. The best
controller had an average delay of 76% across the three conditions
of input load, while the poorest had an average delay of 91%.

Tables 2, 3, and 4 summarize the findings of this experiment with
regard to the two main variables. As shown in Table 2, the effect
of input load on a single controller’s performance was quite
pronounced over the range of values sampled, using mean percent
delay and mean percent excess fuel consumption as the criteria. In
line with observations made in related studies in this series
(Schipper, Versace, Kraft, & McGuire, 1956; Versace, 1956),
performance decrement was sharply accelerated when the average
interval between arrivals dropped below 60 seconds. While the
other indices of performance in Table 2 tend to support the above
relationship, the differences between experimental conditions were
not statistically reliable, as determined by the nonparametric χr²
test (Siegel, 1956). This lack of statistical reliability is at
least partially attributable to the low over-all frequency of
occurrence of a measurable event such as a missed approach.
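
For readers unfamiliar with the nonparametric test cited above, the following sketch computes a χr² statistic, assuming it is the Friedman two-way analysis of variance by ranks as presented in Siegel (1956). The function names and any data passed to them are hypothetical; no correction for ties is applied beyond averaging tied ranks.

```python
def rank_row(values):
    """Rank values 1..k within one row, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for index in order[i:j + 1]:
            ranks[index] = average_rank
        i = j + 1
    return ranks


def friedman_chi_r2(rows):
    """Friedman statistic for N rows (subjects) by k columns (conditions)."""
    n, k = len(rows), len(rows[0])
    column_rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(rank_row(row)):
            column_rank_sums[j] += rank
    sum_of_squares = sum(s * s for s in column_rank_sums)
    return 12.0 * sum_of_squares / (n * k * (k + 1)) - 3.0 * n * (k + 1)


if __name__ == "__main__":
    # Hypothetical delay scores for three controllers under three input loads.
    scores = [[40.0, 75.0, 130.0], [42.0, 70.0, 125.0], [38.0, 80.0, 128.0]]
    print(friedman_chi_r2(scores))
```

When every subject orders the conditions identically, the statistic reaches its maximum of N(k − 1); with few subjects or many zero entries that maximum is small, which is consistent with the caution expressed about low-frequency events.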

Table 3 presents a comparison of various 
sized control units where the input to the to- 
tal system was held constant. Since the result- 
ant input load per controller decreased pro- 
portionately as unit size was increased, a rea- 
sonable expectation would be a progressive 
improvement in performance with the larger 
teams. The observations in Table 3 failed to 
verify such an expectation. There was some 
small improvement as load per controller was 
moderated, but this change was not enough 
to allow the rejection of the null hypothesis. 

The data summarized in Table 4 indicate 
that when load per controller was held con- 
stant (resulting in an increasing total system 
load when control unit size is increased) per- 
formance was subject to a progressive decline 
with increasing unit (team) size, a decrement 
which was statistically reliable for the meas- 
ures of mean percent delay and fuel consump- 
tion. With the other indices in Table 4, these 
differences were not statistically significant. 
There was no observable interaction between 
control unit (team) size and input load. 

It is necessary to re-emphasize several peculiarities of the
experimental design at this point to help in the evaluation of the
results. As has been indicated previously, it was not possible to
vary team size without the occurrence of a concomitant shift in
input load characteristics either with regard to the system or to
the individual operators. The solution of this dilemma employed by
the present study was to explore both major alternatives, constant
load per system/varying load per controller, and varying load per
system/constant load per controller. Even with the use of this
technique, however, there remains the inevitable confounding, and
any conclusions must be qualified by this fact.


TABLE 2
THE EFFECT OF INPUT LOAD ON A SINGLE CONTROLLER

                                          Average Interval between
                                              Aircraft Arrivals
Criteria                                  90 Seconds  60 Seconds  30 Seconds  χr² a
Mean percent delay                           40.1        75.6       129.6     [illegible]
Mean percent excess fuel                     52.0        78.2       146.8     13.7**
Separation errors per aircraft processed:
  Airborne                                   .031        .077       .110      [illegible]
  Runway                                     .026        .033       .064      [illegible]
Missed approaches per aircraft processed     .027        .020       .038      3.5
Mean departures permitted per 30-minute
  problem                                 [illegible] [illegible] [illegible] [illegible]

a The [Friedman two-way] analysis of variance was employed (Siegel, 1956).
b The relatively high frequency of zero cell entries for this condition
severely limits the usefulness of the comparative statistics.
** p < .001.


TABLE 3
THE RELATIVE PERFORMANCE OF VARIOUS SIZED CONTROL UNITS WITH A
CONSTANT INPUT LOAD TO THE SYSTEM

Criteria                                  One-Man  Two-Man  Three-Man  H a
Mean percent delay                         129.6    122.6     122.5    1.3
Mean percent excess fuel                   146.8    134.4     126.8    1.7
Separation errors per aircraft processed:
  Airborne                                  .110     .092      .077    [illegible]
  Runway                                    .064     .052      .051    [illegible]
Missed approaches per aircraft processed    .038     .033      .043    [illegible]
Mean departures permitted per 30-minute
  problem                                   5.2      5.3       5.4     0.1

a The Kruskal-Wallis one-way analysis of variance test was employed
(Siegel, 1956).
b The relatively high frequency of zero cell entries for this condition
severely limits the usefulness of the comparative statistics.

TABLE 4
THE RELATIVE PERFORMANCE OF VARIOUS SIZED CONTROL UNITS WITH A
CONSTANT INPUT LOAD TO THE CONTROLLER

Criteria                                  One-Man  Two-Man  Three-Man  H a
Mean percent delay                          75.6    122.6     132.6    20.4**
Mean percent excess fuel                    78.2    134.4     146.0    19.0**
Separation errors per aircraft processed:
  Airborne                                  .077     .092      .063    1.4
  Runway                                    .033     .052      .063    [illegible]
Missed approaches per aircraft processed    .020     .033      .042    [illegible]
Mean departures permitted per 30-minute
  problem                                   3.1      5.3       5.9     [illegible]

a The Kruskal-Wallis one-way analysis of variance test was employed
(Siegel, 1956).
b The relatively high frequency of zero cell entries for this condition
severely limits the usefulness of the comparative statistics.
** p < .001.


The predictability of team performance on the basis of individual
scores was assessed using four predictor variables. The variables
used were the average score of the constituent members, the better
(best) member’s score, the poorer (poorest) member’s score, and
the variability (range) of the constituent members’ individual
scores. No reliable prediction of the team score was obtained.

A final consideration was controller communication activity. Each
problem was time sampled for the proportion of the controller’s
time spent in communication with pilots [and] the proportion of
his time utilized in communication with fellow controller(s). It
was observed that in the three-man teams [...] 30% of the center
controller’s communication time was spent talking to the other two
controllers [...]. These proportions represent only a single
controller’s talk time to other members of the team and do not
include either [...] the reply (i.e., the controller’s “output”
only was included) or communications to or from those who played
ancillary roles in the control center simulation.
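
The four predictor variables named in the team-prediction paragraph above can be computed from the members' individual scores as sketched here. The sketch assumes the scores are delay percentages, so that a lower score is the better one; the function name is hypothetical, and pairing the reported best (76%) and poorest (91%) individual delays into one team is purely illustrative.

```python
def team_predictors(member_scores):
    """Return the four predictors for one team from its members' individual scores.

    Scores are treated as delay percentages, so lower is better (assumption).
    """
    best = min(member_scores)
    poorest = max(member_scores)
    return {
        "average": sum(member_scores) / len(member_scores),
        "best": best,
        "poorest": poorest,
        "range": poorest - best,
    }


if __name__ == "__main__":
    # Illustrative two-man team built from the best and poorest individual delays.
    print(team_predictors([76.0, 91.0]))
```

Each predictor would then be compared with the observed team score; the study reports that none of the four gave a reliable prediction.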


Discussion 


The results of this experiment indicate that [crew] augmentation
does not necessarily improve the capacity of a complex man-machine
system. These findings fit within a growing body of data on group
and team performance that now covers a considerable variety of
task dimensions. The basic finding that group productivity is
proportionately inferior to individual productivity has not been
contradicted. [...]

It is possible also that there is a motivationally beneficial
by-product of integrated group activity that leads to continuity
of group effort (Bavelas, 1953). This factor can have great
practical significance in those instances where [...] to a group
[...].

[Although] the processing rate [is] not increased proportionately
as team size is increased, there is good reason to believe that
the larger units do have an increased ability to fulfill a buffer
storage function in the system. Previous studies in this series
(Kidd & Hooper, 1958; Kidd & Kinkade, 1958) have shown that the
most sensitive measure of load is the number of aircraft under
control rather than input rate, per se. [...] data from the
present study [...] manipulation of input rate [...] regarding
“number of aircraft [...]” is warranted. However, [...] units for
short-term system effectiveness is in the increased temporary
storage capacity that they provide.

Nevertheless, it seems clear that a system 
design and system management which mini- 
mizes interoperator coordination and integra- 
tion demands will yield superior performance. 
It now becomes the problem of future investi- 
gations to determine precise techniques for 
work load allocation that lead to maximum 
operator autonomy but do not necessarily ex- 
clude all opportunities for operator interac- 
tion with its potential benefit to morale. 

The somewhat paradoxical inferiority of the 
group as compared to the individual raises 
again the issue of the contribution of each 
constituent member to the team. The results 
of the attempted prediction of group from in- 
dividual performance undertaken in this ex- 
periment were negative and therefore must re- 
main inconclusive. It is quite apparent, how- 
ever, that the disappointingly low output of 
the group is not due simply to limits imposed 
by the poorest member. 


SUMMARY 


In this study a comparative evaluation was 
made of the effect of input load and team size 
on the productivity of a radar approach con- 
trol unit. The context was a simulated radar 
approach control center, and the task assigned
was that of pattern-feeder controller.

Nine laboratory trained controllers partici- 
pated in a total of 54 problems. Input load 
was varied by spacing the interval between 
aircraft arrivals at either 90 seconds, 60 sec- 
onds, or 30 seconds, on the average. Control 
unit size was varied by using one, two, and 
three operators per unit. 

Results regarding input load confirmed
previous findings that performance falls off
sharply as load is increased. When input load
to the system was held constant and the con-
trol unit size was increased, leading to a de-
crease in load per controller, performance was
upgraded only moderately. When input load
to the system was increased proportionately to
the increase in team size, resulting in a con-
stant load per controller across conditions,
performance was markedly diminished in the
multiman units.

No reliable prediction of team performance
was observed on the basis of four predictor
variables derived from individual performance
indices.

The tentative conclusions were that maxi- 
mum performance can be attained from multi- 
man system operations when the coordination 
demands are minimized. A reservation im- 
posed on this conclusion was suggested from 
other cited research which has indicated that 
complete functional isolation or autonomy 
may have deleterious motivational effects in 
the long run. 


REFERENCES 


Bavelas, A. Communications patterns in task-ori-
ented groups. In D. Cartwright & A. Zander
(Eds.), Group dynamics. Evanston, Ill.: Row,
Peterson, 1953. Pp. 493-506.

Fiedler, F. E. Assumed similarity measures as pre-
dictors of team effectiveness. J. abnorm. soc. Psy-
chol., 1954, 49, 381-388.

Hixson, W. C., Harter, G. A., Warren, C. E., &
Cowan, J. D. An electronic radar target simulator
for air traffic control studies. USAF WADC tech.
Rep., 1954, No. 54-569.

Kidd, J. S. Social influence phenomena in a task-ori-
ented group situation. J. abnorm. soc. Psychol.,
1958, 56, 13-17.

Kidd, J. S., & Hooper, J. J. Division of responsibility
between two controllers and load balancing in a
radar approach control team: A study of human
engineering aspects of radar air traffic control.
USAF WADC tech. Rep., 1958, No. 58-473.

Kidd, J. S., & Kinkade, R. G. Air traffic control sys-
tem effectiveness as a function of the division of
responsibility between pilots and ground control-
lers: A study in human engineering aspects of
radar air traffic control. USAF WADC tech. Rep.,
1958, No. 58-113.

Kinkade, R. G., & Kidd, J. S. The effect of team size
and communication availability on decision-making
performance. USAF WADC tech. Rep., 1958, No.
58-474.

Moore, O. K., & Anderson, S. B. Search behavior in
individual and group problem solving. Amer. sociol.
Rev., 1954, 19, 702-714.

Schipper, L. M., Versace, J., Kraft, C. L., & Mc-
Guire, J. C. Human engineering aspects of radar
air traffic control: IV. A comparison of sector and
in-line control procedures. USAF WADC tech.
Rep., 1956, No. 56-69.

Siegel, S. Nonparametric statistics for the behavioral
sciences. New York: McGraw-Hill, 1956.

Versace, J. The effect of emergencies and communi-
cations availability with differing entry rates: A
study in human engineering aspects of radar air
traffic control. USAF WADC tech. Rep., 1956, No.
56-70.

Warriner, C. K. Groups are real: A reaffirmation.
Amer. sociol. Rev., 1956, 21, 549-554.

Wolff, K. H. The sociology of Georg Simmel. Glen-
coe, Ill.: Free Press, 1950.


(Received August 5, 1960)


Journal of Applied Psychology 


Vol. 45, No. 4                                        August 1961


EMOTIONAL DISRUPTION AND INDUSTRIAL
PRODUCTIVITY ¹


STANLEY SCHACHTER                    BEN WILLERMAN
Columbia University                  University of Minnesota

LEON FESTINGER                       AND RAY HYMAN
Stanford University                  University of Oregon


The changes in working procedures required by the introduction of
new models or methods of construction are regular and quite
frequent events in many industries, but the smoothness of the
required transition from one working procedure to another is
highly variable. Occasionally, rebalancing a line or introducing
new methods of work proceeds with no difficulty. In a matter of
hours or even [...], a work group may be operating as [smoothly]
and productively as immediately prior to the change. More often,
however, productivity or quality of work may drop precipitously
after a change and it may require days or weeks for a work group
to reach the [required] quality and production goals.

The factors affecting the smoothness of [...] of transition are
undoubtedly multiple. Engineering and planning practices play [a]
role, and supervisory practices undoubtedly have an impact.
Finally, there is [...] evidence that psychological factors
[affect] the receptivity or resistance of the workers to a change.
The purpose of the research to be described was to examine the
[effects] of some of these psychological factors in experimental
situations where it is possible to rule out or control the
possibly confounding effects of engineering, planning, and
supervisory practices.

¹ These studies were sponsored by the Behavioral Research Service
of the General Electric Company. The authors are particularly
grateful to [name illegible], Director of the Behavioral Research
Service, for [...] in initiating, planning, and carrying out these
experiments. We wish also to express our appreciation to the
managers and other operating personnel of the General Electric
departments in which [these studies] were conducted for their
cooperation in the execution of each of these experiments.

For assembly line operators the motor skill 
difficulties involved in most changeovers ap- 
pear to be minimal since each job consists of 
a few basic movements repeated over and 
over again. Most rebalancings, insofar as they 
affect the individual operator, involve such 
minor modifications of a basically simple pat- 
tern of work that only a few minutes or 
hours of practice should restore the operator 
to his former level of skill. Yet weeks sometimes go by before an
operator returns to an acceptable level of quality and output.
Since the difficulty of relearning alone cannot be considered an
adequate explanation, it would seem that emotional and
motivational factors may be heavily involved.²

It is commonly accepted that emotional disturbance can seriously
interfere with the performance of particular kinds of tasks. Such
states as fear, rage, and hostility may seriously disrupt the kind
of coordination required in the performance of an assembly
operation. The question immediately arises, however, as to whether
such emotional states are equally disruptive of the performance of
all manual tasks of this sort or whether the degree of disruption
varies with the nature of the task being performed. As a first
step in consideration of this question, let us examine a dimension
of motor activity which we shall call stereotypy. By stereotypy we
refer to the extent to which behavior has an automatized or
habitual character, requiring neither concentration, attention,
nor thought. Examples of thoroughly stereotyped behaviors in
adults would be walking, eating, dressing, and the like. It is
perfectly possible to carry on such activities while paying
virtually no attention to the actions involved.

² For a full discussion of group effects on workers’ motivation to
accept or resist a change, see Coch and French (1948).

It is suggested that, once mastered, the
typical industrial assembly operation is of
precisely this stereotyped character.³ It is an
operation which may be repeated thousands 
of times daily and over the course of time 
undoubtedly becomes a thoroughly automa- 
tized pattern of motor activity. Consider now 
the effects of introducing a change in work 
procedure on this pattern of stereotyped be- 
havior. From a completely habitual set of 
actions, the job once more requires the op- 
erator's attention. Probably even a trivial 
modification will require the operator's con- 
centration; and, where previous to the intro- 
duction of a new work procedure an operator 
may have performed his job with dispatch 
and virtual automaticity, the change requires 


3 If this characterization of the assembly process is
correct, it facilitates a resolution of what has so
long seemed a paradox: the fact that though an
assembly job is precisely the kind of work that is
so often considered tedious, demoralizing, and
insupportable, even casual observation of such a factory
operation makes it apparent that this is simply not
the case. The operators on such a line are usually
relaxed and at ease, they converse casually and joke
freely with one another, all the while doing their
job. Interviews conducted with assembly workers
indicate a relatively high degree of job satisfaction.

Experimental studies of satiation provide some
degree of insight into the process which makes this
repetitious, automatic sort of job a bearable and
perhaps even a pleasant one. In such experiments,
subjects are required to work continuously at a
repetitive task and their rate of boredom, fatigue,
and satiation is measured. Karsten (1928) demonstrated
experimentally that when their task is an important
one requiring the subjects to concentrate upon their
activity, satiation sets in much more rapidly than
when the task is a peripheral one to which the
subject pays little attention. This would suggest that
an assembly operation is bearable because it quickly
becomes a peripheral, almost automatized or
stereotyped pattern of behavior like walking. And the
observation that operators are able to perform their
jobs while talking or daydreaming would certainly
support this characterization of the assembly
operation as stereotyped behavior.


Schachter, Willerman, Festinger, and Hyman 


" d | 
that he think about and be constantly aware
of his movements. The effect of introducing
a change, then, is to convert a stereotyped
pattern of behavior into one which requires
concentration, for a time at least, if the activity
is to be successfully carried out.
Let us now consider the effects of states
of emotional disturbance on behaviors of varying
degrees of stereotopy. It is suggested as
our major hypothesis that emotional disturbance
will be maximally disruptive of motor
behaviors requiring thought and concentration
and minimally disruptive of thoroughly
stereotyped behaviors. Translating this into
terms directly applicable to the industrial
problem under consideration, it is suggested
that:

1. When assembly operators are performing
their tasks in a stereotyped manner and
no procedural changes are underway, the
quality and quantity of production will be
little affected by considerable variation in the
emotional states of the operators.

2. When changes in working procedures are
introduced, emotionally disturbed operators will
have considerably more difficulty in making the
transition than will relatively undisturbed
operators and this will be reflected in reduced
productivity.


PROCEDURE

The basic experiment conducted to test these
hypotheses is, in essence, a simple one.
Assembly line groups, long experienced at their jobs
and performing identical operations, were chosen
as experimental groups. Over a period of several
weeks, the experimentally “Disfavored” groups were
systematically subjected to a series of
annoyances while the work situation of the
experimentally “Favored” groups was made as pleasant
as possible. Following this period of manipulation
of emotional state, an identical changeover in work
procedure was introduced in the experimental groups.
Throughout the course of the study detailed records
were kept of the daily production of the
groups. Comparison of the productivity of the
experimental groups allows evaluation of the effects
of emotional state on workmanship during periods
of normal operation when work is stereotyped and
during periods of changeover when work requires
close attention.

Though the design of the experiment was simple,
its realization was not. The sheer complexity
of training and coordinating the efforts
of the manipulators and data collectors was such
that it was simply not feasible to work with more
than two pairs of experimental groups at a time.
Since this number of groups is too small to allow
considerable confidence in experimental differences,
the basic experiment was replicated three
times. The results of these three independent
experiments are presented in this paper.


1. The first experiment was conducted in the Home
Laundry Department of General Electric's Appliance
Park. This study will be called Louisville I.

2. The second experiment was conducted in the
Owensboro Tube Plant of General Electric's Receiving
Tube Department. This study will be called
Owensboro.

3. A third experiment was conducted again in the
Home Laundry Department of Appliance Park and
will be called Louisville II.

Though the details of each study necessarily vary
from factory to factory the basic procedure is
identical in the three experiments. We shall, in this
section, describe the essential similarities.


Experimental Groups

In all experiments, the experimental groups consisted
of small assembly groups varying in size from
[...] to [...] workers. All groups were of the classic
assembly line work pattern arranged so that a piece passed
from worker to worker emerging at the end of
the line as a finished or semifinished product. Though
each worker's job is different, he repeats precisely
the same operation on each piece on which he
works. In all of the experimental groups, the
operations involved only manual movements
such as wire parts assembly, fine welding, connecting
a lead, and the like.

Within each experiment, the comparison groups were
assembly lines performing the same operations
and producing the same product. Such groups
were matched as closely as possible in age,
productivity, experience on the line, length of employment
with the company, and disciplinary record.
In all cases attempts were made to choose experimental
groups whose productivity records indicated
that they had been working at a stable rate
for several weeks prior to the beginning
of the experiment and where a majority of the
operators on each line had been at their particular
job for more than 6 months. Such workers have
repeated their basic operations hundreds of
thousands of times.

Manipulation of Emotional State

It was the intent of the manipulations to make
one set of workers as disturbed and upset and the
other set as happy as possible. In each experiment the
cooperation of local engineering and supervisory
personnel was enlisted in devising and
producing a stream of skits and maneuvers designed
to produce hostile and irritated states of mind in one
set of workers and states of contentment and satisfaction
in the other. The specific manipulations, of course,


Emotional Disruption and Industrial Productivity 




varied from experimental locale to locale depending 
upon the nature of the job and upon local conditions. 
To eliminate any aspect of artificiality, all the 
manipulations used were characteristic of and normal 
to the working life of each factory. However, they 
were intense and concentrated in a relatively short 
period of time. 

The procedures employed in the Louisville I study
illustrate the general nature of these manipulations.
For the Disfavored group over a 3-week period of
manipulation, at least one annoying incident
occurred on 11 of the 15 working days. These incidents
centered around two themes: (a) a threatening and
persistent time-motion study of the operators on
the Disfavored line, (b) a persistent attack by
quality-control and supervisory personnel on the
quality of the workmanship on this line. In
addition, during this period of manipulation, a series
of irritating incidents was precipitated or
aggravated. For example, when washers were in short
supply, the Disfavored operators were forced to sort
out washers from a collection of greasy and dirty
parts.

For the Favored groups, the chief techniques 
employed were praise and flattery of the high qual- 
ity of work on this line by engineering and 
supervisory personnel. In addition to praise, when- 
ever management personnel were in the area, they 
went out of their way to be friendly and helpful 
and to give credit for suggestions. Finally, a de- 
liberate and constant effort was made to prevent 
the occurrence of any irritating events so that for 
the 3 weeks of manipulations, life, for these groups, 
was relatively smooth, untroubled, and pleasant. 

In all three studies, the strategy of the manipula- 
tions was similar: maintain a continuous and per- 
sistent nagging at the Disfavored groups throughout 
the course of the manipulation period but concen- 
trate a flurry of annoying manipulations at the very 
end of the manipulation period so as to have the 
Disfavored groups as disrupted and angry as possible
on the day of changeover.⁴ The manipulations
proper ceased completely the day the changeover
started. Though ideally one would want to maintain
maximum differences in emotional state between
Favored and Disfavored groups all through the
changeover period, we feared that inadvertently a
manipulation of emotional state might artifactually
affect productivity. Rather than take this chance
we deliberately decided to end all manipulations of
emotional state with the beginning of the changeover
period. This feature of the experimental design
does mean, however, that the emotional differences
between the Favored and Disfavored groups probably
decreased during the course of the changeover
period.

4 This same factor dictated that the changeover,
in all studies, take place on a day in the middle of
the work week. It seemed a reasonable hunch that to
allow a weekend to intervene between the final
manipulations and the first day of changeover could
only attenuate the impact of the manipulations.




After each of these manipulated incidents, the 
personnel involved dictated a detailed description 
of what they had done and what reactions they 
had evoked. In addition, the supervisory personnel 
involved in these studies kept a record of remarks 
and incidents relevant to the manipulations. This ma- 
terial is used to evaluate the effectiveness of the ma- 
nipulations. 


The Changeover 


Following the weeks of manipulation of emo- 
tional state, an identical change in work procedure 
was introduced for each pair of matched assembly 
groups. As far as the workers were concerned, these 
changes were coordinated to the production of a 
new product or the rebalancing of a line in order 
to produce a different number of units per day. 
A few days prior to the changeover, the operators
were informed of the impending changeover and
the reasons for it. For the first few hours of the
day of changeover, the foreman and a trainer
worked closely with the operators showing them
the new procedures and then left the operators
pretty much on their own.


For most of [...] jobs were changed [...]
Owensboro study [...] from the assembly of one
vacuum tube type to another type. [...] operators
had to [...] but in most cases [...]
minor changes in procedure.


Measures 


The chief data collected in all three studies were,
of course, data on the quantity and quality of
production of each group. As standard factory
procedure, a [...]-hour record is kept of the [...]

In the Louisville II study, special operators were
trained to inspect all of the units produced by the
experimental groups. Again these operators kept a
daily systematic record of all defective units and
their cause.

In the Owensboro study, all tubes produced on
each line were tested on a machine which rejected
defective tubes and automatically tabulated the
nature of the defect.

In addition to these basic data on production,
other data were collected by means of observation
and interviewing of the workers in some of the
experimental groups. Where it is relevant, the
techniques used will be presented in the body of this
report.


Sequence of Events 


In all three studies a common procedure and
sequence of events was followed. After the
experimenters, in collaboration with the management of
the plant involved, had selected the experimental
groups, the workers in each of these groups were
called together for a special meeting with one of the
experimenters. The operators were told that a study
would be conducted with several groups in the
plant. The explanation of the presumed purpose of
the study varied from experiment to experiment
but, in general, the studies were explained as an
examination of factors affecting the maintenance
of a good production rate. Such a study should
be of value to the operators because what would
be learned should help to keep average earnings up
and useful to the company for it should assist in
preventing delays and in meeting the
competition from various sources. The operators
were, of course, told nothing of the true purpose
of the study. The purpose of this meeting was to
provide a rationale for the fairly constant presence
of the experimenter or his staff on the factory floor
and for the introduction of the new data collection
procedures.

A separate meeting was held with the foremen
and supervisors of each of these groups. These
people were told about the study in detail in order
to insure that they would not inadvertently
interfere with the manipulations or affect the outcome
of the experiment. At this meeting, the roles of
these supervisory personnel were explicitly defined
for them. They were asked to be in the experimental
work areas only when the normal requirements
of work demanded their presence, otherwise
to stay away from these work areas.

Following these orientation meetings a premanipulation
period of approximately 2 weeks was instituted.
During this period, production data were
systematically gathered to provide a normal base
line. At the end of this period the manipulations of
emotional state began and, depending on the
experiment, lasted from 2 to 4 weeks. At the end of
this time, the manipulations were halted and the
changeover introduced. Again depending on the
experiment, postchangeover data were collected for
periods ranging from 1 to 4 weeks at which point
the experiment was terminated.


Apologia 


[...] each of these studies was considerably
[...] than in realization. Such
methodological niceties as counterbalancing the effects
of day and night shift operations, repeated
interviewing of the experimental operators, and the
like were carefully incorporated into the preliminary
design of the study. However, the problems of
attempting to keep tight control of the experimental
situation in the hubbub of a major industrial
operation forced us to abandon many of
these niceties if we were to have any experiment
at all.

Once underway, extended field experiments of this
sort are particularly vulnerable to disruption. Each
of the experimental groups, though small, was a
vital part of an extensive industrial operation. In
the Louisville I study, for example, the 22 workers
involved in the experimental groups produced a
control unit which was an essential part of the
[...] 1,700 [...].
Any serious interference with the work of
the experimental groups could, therefore, disrupt
the entire factory. This fact made it necessary to
adhere to a rigid and fixed schedule of experimental
events. In preparing for the changeover, for
example, it was necessary to plan weeks in advance
to build up sufficient storage of control units to
insure that the factory could continue in operation
should the experimental groups fail to produce
[...] on changeover. Such factors made
the scheduling of major experimental events almost
completely inflexible and, therefore, unhappily
susceptible to chance factors which could ruin the
experiment. For example, should a few of the
experimental operators be home sick on the crucial
day of the changeover nothing could be done: the
changeover could not be postponed, replacement
operators would be put on the line, and the experiment
would be ruined. These experiments were carried
out with superb cooperation from management but
like all field studies they are still subject to the
disrupting chance factors characteristic of real life
situations.

In one way or another each of the three studies
was plagued by difficulties of this sort. The major
difficulties were the following: The Owensboro
study was originally planned with four pairs of
matched experimental groups. On the day of changeover
planned for two of these pairs Kentucky
had a severe snowstorm. There was sufficient
absenteeism so that none of these groups
changed over. On the second and third day following
[...] absenteeism was still great and some
groups worked while others did not. By the
[...], all groups were working but enough
[...] operators were involved to still make
[...] impossible. This debacle, of course,
forced us to discard the data of these four groups.


It should be noted, though, that the little data 
that do exist for these groups support the hypotheses 
of the study. 

A difficulty of another sort interfered with the 
Louisville I study. Originally this study was planned 
for two pairs of matched work groups. Midway 
through the study, it was discovered that one 
of the Favored groups rotated jobs not only with 
one another but with members of other lines—a 
detail which made the productivity data collected 
on this group worthless. Although it might have 
been possible to have the foreman order the men 
to stick to their jobs, this could hardly have been 
considered an ideal event for a group that was 
presumably being favored and it was decided to
drop this group and, of necessity, the Disfavored
group with which it was matched, from the
experiment.


RESULTS 


Louisville I 


In the first study, the experimental groups 
consisted of two small assembly lines whose
job was the assembly of control units for 
washing machines. One of these groups worked 
on the day shift and the other on night shift. 
On each shift these were identically set-up 
line operations consisting of 11 operators, all 
but one of whom were women. The first five 
operators on these lines were involved en- 
tirely in assembly work, the following two 
operators in line were involved in assembly 
and testing operations, and the last four op- 
erators in testing, checking, and repair work. 
Only the first seven workers in each line are
relevant to our study for these operators did
all of the assembly work.

Each of these lines was a paced-conveyor 
operation—that is, a conveyor belt traveling 
at a fixed speed, over which the operators 
had no control, carried the work mounts to 
each worker and the operator was forced to 
work at the pace set by this conveyor belt. 

The general nature of the manipulations 
of emotional state in this first study has been 
described in the preceding section. There is 
every indication that these manipulations 
were successful in making the Disfavored 
group (the day shift line) hostile, upset, 
and angry while the Favored group (the night 
shift line) remained relatively tranquil. The 
protocol dictated by the manipulators and
supervisors of these people indicates again
and again that the Disfavored subjects were
disturbed by the manipulation. Angry comments
such as “Somebody is trying to cut our
throats” were common.

After exactly 3 weeks of manipulations, the 
changeover was introduced by means of re- 
balancing these lines. Before the change, each 
line was scheduled to produce 1,174 units 
per day. The new balance called for produc- 
tion of 1,044 units per day. This was effected 
by dropping one of the seven assembly work- 
ers from the line. Her job was redistributed 
among the remaining assemblers so that fol- 
lowing the change the jobs of five of the six 
remaining assemblers were changed in some 
procedural details. 

Following the change, the two groups re- 
mained at their new jobs for approximately 
4 weeks at which time the plant went on a 
plantwide one-shift operation and the study 
was over. 

Data on the quality of workmanship during
the course of the study are presented
in Table 1. The percentages in this table
represent the proportion of the total number
of units assembled which required repairs
due to operator error. Each of these figures
is based on the mean of those days during
any particular experimental phase, for which
data exist for both experimental groups. On
many days one or more of the regular operators
was absent from her station. If it was
impossible to find a thoroughly experienced
replacement (as it almost always was) the
data for this day's work were automatically
discarded. Since inspection of the data made
it clear that there was a high day-by-day
correlation in the workmanship of the two
groups (due to such factors as a run of bad
parts) the data in this and all following tables
are based wholly on the figures for those days
for which the data exist for both experimental
groups. The number of such days is presented
in the column headed “Matched Days.” It
should be noted, though, that the magnitude
of the differences between groups in this and
all following tables is much the same whether
the figures are based on matched days or on
the total number of days for which data
are available.

TABLE 1
THE QUALITY OF WORK DURING THE
LOUISVILLE I EXPERIMENT

                                        Percentage of Assembled Units
                                        Requiring Repair
Phase of the          No. of            Favored        Disfavored
Experiment            Matched Days      Unit           Unit

Premanipulation
Early Manipulation         4            10.6           11.8
Late Manipulation          7            11.7           14.7
Changeover
  First Week a             3            21.1           31.4
  Second Week              5            13.8           [illegible]
  Remainder                8            11.6           20.0

a For reasons discussed earlier, the changeover always took
place on a day in the middle of the week. In this and following
tables the period called “First Week” runs
from the day of changeover to the Friday of the same week.
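
The matched-days rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the day labels, and the daily percentages are hypothetical placeholders, not data from the study. Only the rule itself (keep a day when both groups have usable data, then average within the phase) comes from the text.

```python
from statistics import mean


def matched_day_means(favored, disfavored):
    """Return (n_matched_days, favored_mean, disfavored_mean).

    Each argument maps a day label to that day's percentage of units
    requiring repair. A day is simply missing from the mapping when its
    data were discarded (for example, because an operator was absent).
    """
    # Keep only the days for which BOTH groups have usable data.
    matched = sorted(set(favored) & set(disfavored))
    if not matched:
        return 0, None, None
    return (
        len(matched),
        mean(favored[day] for day in matched),
        mean(disfavored[day] for day in matched),
    )


# Hypothetical daily figures, for illustration only (not from the paper).
favored_days = {"day1": 10.0, "day2": 12.0, "day4": 11.0}
disfavored_days = {"day1": 12.0, "day3": 15.0, "day4": 14.0}

n_days, fav_mean, dis_mean = matched_day_means(favored_days, disfavored_days)
print(n_days, fav_mean, dis_mean)
```

In this made-up case only day1 and day4 are matched, so day2 and day3 drop out of both averages, which is the behavior the “Matched Days” column reports.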

It will be noted that there are no entries 
for the normal or premanipulation period. 
This unfortunate state of affairs is due to the 
fact that for most of this 11-day period, one 
or another of the four data collectors was 
home sick with the Asiatic flu. In order to
give some notion of the pre-experimental
difference between these two groups, we have
divided the manipulation period into two
phases: the early phase or first week of
manipulations, at which time the difference
in emotional disruption of the two groups
must still have been relatively small, and
a late manipulation phase or the last 2 weeks
of the manipulation period. It will be noted
that the two groups are virtually identical
during the early manipulation period differing
by only 1.2%. And for the late manipulation
period, the two groups are still quite similar
differing by only 3%. The evidence is good,
then, in support of our first hypothesis. At
a time that work is stereotyped, emotional
state has virtually no effect on workmanship.
A day-by-day comparison of these two groups 
during the total manipulation period indicates 
striking similarity between the two groups. 
For 8 of the 11 matched days the two groups 
differ by 2% or less. 
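
As a quick check on the arithmetic, the manipulation-period figures in Table 1 (10.6 vs. 11.8 for the early phase, and 11.7 vs. 14.7 for the late phase, reading the garbled late-phase entry as 14.7) reproduce the differences quoted in the text. The dictionary layout and names below are assumptions made for illustration.

```python
# Percentage of assembled units requiring repair, as read from Table 1.
# Keys and structure are illustrative; only the four figures come from the table.
table1 = {
    "early_manipulation": {"favored": 10.6, "disfavored": 11.8},
    "late_manipulation": {"favored": 11.7, "disfavored": 14.7},
}


def difference(phase):
    """Disfavored minus Favored error rate, in percentage points."""
    row = table1[phase]
    return round(row["disfavored"] - row["favored"], 1)


early_diff = difference("early_manipulation")
late_diff = difference("late_manipulation")
print(early_diff, late_diff)
```

The two results correspond to the “1.2%” and “3%” differences cited for the early and late manipulation phases.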

Following the changeover, the difference 
between the two groups is marked. In the 
first 3 days following the changeover, the 
Favored group increases its errors by 9.4%
while the Disfavored group increases by
16.7%. In the following weeks the difference
between the groups grows even larger for the 
Favored unit quickly returns to its prechange 
level of errors, while the Disfavored unit 
remains at a level almost twice that of its 
prechange level for the remainder of the 
study. For the entire manipulation period, the 
Disfavored group made 2.3% more errors 
than did the Favored group. For the entire 
postchangeover period, the Disfavored unit 
made 16.7% more errors than did the Favored 
unit. The evidence is good, then, in support 
of our second hypothesis—when changes in 
working procedures are introduced, emotion- 
ally disturbed workers will have more diffi- 
culty at their work than will relatively un- 
disturbed operators. 

So far we have considered the effect of 
the manipulations on the quality of work- 
manship. With respect to quantity of output, 
during the several phases of the experiment, 
these groups differed from one another by 
less than 1% in the number of units pro- 
duced. Following the change, both groups 
came up to the scheduled production rate 
quite rapidly. Both groups were within a few 
units of scheduled production on the days 
immediately following the change and by the 
fifth day both groups steadily produced at 
the scheduled rate.

This pattern of marked difference in quality
and no difference in quantity is hardly
surprising when the specific nature of the
operation is considered. It will be remembered
that these groups worked at a paced conveyor
operation. Except for emergencies when the
line could be stopped, the operators had no
control of the production rate. Any conspicuous
number of missed units would bring
managerial pressure to bear, and unless an
operator was willing to precipitate a crisis,
she had little choice but to do her job in, if
necessary, a hasty fashion.
Certainly the general theoretical considerations
guiding this study would lead one to
suspect that, in work situations where it is
possible for quantity to vary, one should find
evidence that both quantity and quality of
work will vary with our manipulations. The
Owensboro study was designed both to replicate
the first experiment and to test this
possibility.


Owensboro 


The Owensboro study was conducted in a 
factory which manufactured vacuum tubes 
for electronic equipment. There were four ex- 
perimental groups. Two of these groups, 
known as cage units, were occupied with the 
assembly of the component parts of a tube. 
The remaining two units, known as weld 
units, were concerned with welding the prod- 
uct of the cage unit to a stem. Both pairs of 
units, of course, performed identical opera- 
tions. Each cage unit was composed of five 
female operators and each welding unit of 
eight female operators. All experimental units 
worked on the day shift. 

Both the cage and weld groups were, in 
good part, self-paced units. Unlike the Louis- 
ville I setup, the individual operators on these 
units could, within limits, establish their 
own rate of work. 

Production data were collected for 2½
weeks before the beginning of the manipula- 
tions. The manipulations themselves lasted 4 
weeks. The chief theme of the Disfavored 
manipulations centered on the presumed dis- 
covery that the Disfavored units were re- 
sponsible for a contamination of tubes that 
resulted in the rejection of large numbers of 
the products of these units. The search for 
the precise source of this contamination 
allowed the manipulators to “legitimately” 
put these operators through a series of quite 
annoying incidents such as forcing the girls 
to wash their hands, to wear irritating special
equipment, and the like. In addition, the 
work of these operators was disparaged and 
they were submitted to a continuous stream 
of annoyances of one sort or another. As 
in the previous study, the Favored manipula- 
tions were based largely on a combination 
of praise, flattery, and friendliness. Judging 
from the protocol dictated by the manipu- 
lators and supervisors, the manipulations were 
effective in producing the differential states 
of dissatisfaction required for the experiment. 

The changeover was made by changing the 
tube type on which these lines worked. This 
was a new tube for all of these operators and 
every girl's job was changed. Productivity 
data were collected for approximately 4 weeks 
after the change. 


208 


TABLE 2 


Tue QUALITY OF WORK IN THE CAGE Units 
DURING THE ÜWENSBORO EXPERIMENT 


Schachter, Willerman, Festinger, and Hyman 


TABLE 3 
AVERAGE HOURLY OUTPUT IN THE CAGE UNITS DURING 
THE ÜWENSBORO EXPERIMENT 


Percentage of As- 
sembled Units 
No. of Rejected by 
Phase of the Mutched Testing Machine 
Experiment Days : 
F. 1 Dis- 
Unt favored 
à Unit 
Premanipulation 4 0.41 0.24 
Manipulation 13 0.27 0.27 
Changeover 
First Week 3 0.18 0.95 
Second Week 5 0.52 0.54 
Remainder 6 0.43 0.15 


Let us examine, first, the data on the qual- 
ity of work which is presented in Table 2. 
This table presents data on the quality of 
Work of the two cage units? The figures in 
the table represent the proportion of the 
assembled units which are, owing to cage 
Operator error, rejected by the testing ma- 
chine. Examining, first, the prechangeover 
data we note again strong support for the 
first hypothesis. Where the difference between 
the two groups is relatively small before the 
manipulations, it stows even smaller during 
the manipulation period. Clearly, emotional 
state has little impact on stereotyped work. 

Immediately following the changeover, the 
effects are similar to those in the Louisville 
I study. Where the two groups made identical 
proportions of errors before the change, after 
the change the Disfavored group makes more 
than five times the errors made by the 
Favored groups. Unlike the Louisville I 
study, however, the effect is brief for by the 
second week this difference between the two 
groups has vanished and in the final days 
of the experiment the two groups are back, 


? Unfortunately, quality data on the weld units 
are not available. Unknown to the experimenters, 
whe foreman added extra repair operators to the 
weld lines for the first few days of the changeover 
period. Since any quality differences might as well 
be due to differences between these repairmen as to 
differences in the experi 


mental groups, the weld 
quality data must be treated as meaningless, 


TABLE 3

                                  Average Hourly        Percentage of Production
                     No. of         Production         during Manipulation Period
Phase of the         Matched    Favored   Disfavored     Favored    Disfavored
Experiment           Days       Unit      Unit           Unit       Unit

Premanipulation        5          265       255
Manipulation          13          293       288
Changeover
  First Week           4a         143       120           48.8       41.7
  Second Week          5          168       158           57.5       [illegible]
  Remainder            5          20?       204           71.?       [illegible]

a The number of matched days does not coincide exactly with
those in Table 2. Operational difficulties made it
impossible to collect quality data for a particular day.


approximately, to their premanipulation level
of workmanship. We shall consider this difference
between the Louisville I and Owensboro
studies after presenting additional data.

What about the quantity of output? Unlike
the Louisville I operation, the work
in Owensboro did involve some degree of
worker control of output. The relevant data
are presented in Table 3 for the cage units
and Table 4 for the weld units. The figures
in the tables represent the average number
of units produced per hour. In both tables
we note precisely the same pattern. In the
premanipulation periods the Favored and Disfavored
groups in the two pairs are
fairly well matched in production.


TABLE 4

AVERAGE HOURLY OUTPUT IN THE WELD UNITS DURING
THE OWENSBORO EXPERIMENT

                                  Average Hourly        Percentage of Production
                     No. of         Production         during Manipulation Period
Phase of the         Matched    Favored   Disfavored     Favored    Disfavored
Experiment           Days       Unit      Unit           Unit       Unit

Premanipulation        7          223       207
Manipulation          14      [illegible] [illegible]
Changeover
  First Week           2           88        75         [illegible]   34.?
  Second Week          5          124       ?24           58.2       56.4
  Remainder            6          141       148           66.2       67.3



Emotional Disruption and Industrial Productivity 209 


During the manipulation period the difference between
matched groups remains small. Again the evidence
is good that emotional disturbance has
little effect on stereotyped behavior.

Immediately following the change the two 
Favored units produce at a better rate than 
do their Disfavored counterparts. The differ- 
ence of 23 units in Table 3 and 13 units in 
Table 4 between Favored and Disfavored 
groups during the first week of change may 
not appear to be large but it should be kept 
in mind that these are hourly production fig- 
ures. Over an 8-hour working day during the 
first week of change, the Favored cage unit 
produced a daily 184 units more than did the 
Disfavored group compared to a daily 51-unit 
advantage for the Favored group during the 
combined manipulation and premanipulation 
periods. Similarly, the Favored weld unit had
a daily advantage of 104 units over the Disfavored
weld unit during the first changeover
week compared to its daily 5-unit advantage
during the combined manipulation and premanipulation
periods.
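
The conversion from hourly to daily figures can be retraced with a short sketch. It uses the cage-unit values legible in Table 3 and the 8-hour day named in the text. Weighting the combined premanipulation and manipulation periods by their number of matched days is an assumption of this sketch, not a procedure stated by the authors, and the function names are illustrative only.

```python
# Cage-unit figures as read from Table 3.
# Each entry: (matched days, Favored units/hour, Disfavored units/hour).
CAGE = {
    "premanipulation": (5, 265, 255),
    "manipulation": (13, 293, 288),
    "changeover_week_1": (4, 143, 120),
}

HOURS_PER_DAY = 8  # the 8-hour working day used in the text


def daily_advantage(favored_per_hour, disfavored_per_hour, hours=HOURS_PER_DAY):
    """Daily output advantage of the Favored unit, in units."""
    return (favored_per_hour - disfavored_per_hour) * hours


def combined_daily_advantage(periods, hours=HOURS_PER_DAY):
    """Mean daily advantage over several periods.

    Assumption: the periods are combined by weighting each hourly
    difference by its number of matched days.
    """
    total_days = sum(days for days, _, _ in periods)
    weighted_diff = sum(days * (fav - dis) for days, fav, dis in periods)
    return weighted_diff / total_days * hours


first_week = daily_advantage(*CAGE["changeover_week_1"][1:])
baseline = combined_daily_advantage(
    [CAGE["premanipulation"], CAGE["manipulation"]]
)

print(first_week)       # 184
print(round(baseline))  # 51
```

Under this weighting the sketch reproduces both cage-unit figures quoted in the text. The weld units are left out because their manipulation-period values are not legible in Table 4.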

Correcting the absolute rate of production, 
by the prechange rate, it can be seen (lower 
right section of Tables 3 and 4) that during 
the first week of the changeover the Favored 
cage unit is producing at 48.8% of the rate 
it was producing during the manipulation 
period, while the Disfavored cage unit pro- 
duces at 41.7% of its previous rate. A simi- 
lar difference exists for the two weld groups. 
It would appear, then, that our manipulations 
have affected both the quality and quantity 
of work at changeover times. 
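
The correction by the prechange rate is a simple ratio, sketched below for the cage units. The hourly figures are those of Table 3; the function name is illustrative only.

```python
def percent_of_manipulation_rate(changeover_rate, manipulation_rate):
    """Changeover output expressed as a percentage of the rate
    achieved during the manipulation period."""
    return 100.0 * changeover_rate / manipulation_rate


# First week of the changeover, cage units (Table 3).
favored_pct = percent_of_manipulation_rate(143, 293)
disfavored_pct = percent_of_manipulation_rate(120, 288)

print(f"{favored_pct:.1f}")     # 48.8
print(f"{disfavored_pct:.1f}")  # 41.7
```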

As with the quality data, these effects on
output are temporary. By the second week
after changeover, the differences between
Favored and Disfavored groups grow quite
small and by the third week the two pairs of
groups are producing at virtually identical
rates. Why are the effects in the Owensboro
study temporary while those in the Louisville I
study appear to be long-lasting? An explanation
for this difference seems to us reasonably
apparent.

It will be remembered, first, that owing
to the paced-conveyor working arrangement,
the experimental effects in the Louisville I
study are limited to the quality of workmanship
while in the Owensboro study both quality
and quantity are affected. If one examines
the nature of the product produced in each of 
these operations, an explanation of the tem- 
porary vs. long-lasting effect seems reasonably 
clear. The Owensboro plant manufactured 
vacuum tubes, a product which is virtually 
nonrepairable once the mount has been sealed 
in its glass or metal housing. Since bad parts 
and errors in workmanship are inevitable, the 
plant maintains an extensive testing operation 
to insure that defective tubes are not shipped 
out to its customers. Every tube produced is 
tested and the nature of its defect catalogued 
and assigned to the particular line which pro- 
duced the tube. Of necessity, the plant is 
exquisitely quality conscious and every fore- 
man, supervisor, and line operator knows how 
many defective tubes they produced each day. 
Inevitably considerable pressure is brought to 
bear on lines which deviate from an accepted 
maximum of rejects. 

In the Louisville operation, on the other 
hand, a defective unit is easily remediable by 
a repair operator. Owing to this fact the plant 
kept no records of quality and the only per- 
sons in the entire plant who could have even 
a rough idea of the quality of workmanship 
on our experimental lines were the repair 
operators. Since these operators were appar- 
ently able to handle this extra work and 
appear to have said nothing about it, no 
pressure was put on the lines to remedy their 
workmanship. 

Our thesis, then, is that absence of super- 
visory pressure perpetuated the differences 
between the experimental lines in the Louis- 
ville I study. If this is correct, we should 
certainly expect that were the foreman to put 
pressure on the offending Louisville line, their 
errors would decrease. And the evidence indicates
that this is indeed the case. Four days
before the end of the Louisville I study,6 the
experimenters asked the foreman of the previously
Disfavored group to pressure this
group about the quality of their work. He 
simply went over to some of the assemblers 
on this line, told them that they were making 
too many errors, far more than the night 
shift, and asked them to try to improve. He 


6 The data for these 4 days are not included in 
Table 1. 


210 Schachter, Willerman, 
worked briefly with one of the operators sug- 
gesting changes in her work procedure and 
then left. In the period immediately preceding 
this interlude (the period called “Remainder” 
in Table 1) these operators had been producing
29.0% defective units. In the 4 days
following the foreman's intercession, they reduced
their level of errors to 17.3%, a figure
only slightly greater than that during the 
manipulation period. It would appear, then, 
that the effects of our manipulations are, with 
time and effort, correctible. 


The Owensboro study, then, increases our 
confidence in the basic phenomenon under 
test. In two independent comparisons, the 
results of the Louisville I experiment repli- 
cate. In addition, the results of this study 
add two items of information to our knowledge
of the phenomenon. First, emotional disturbance
will affect the quantity as well as the
quality of work. Second, the effects on workmanship
appear to be remediable by supervisory
pressure.


Louisville II 


The Louisville II study was designed first
as an additional replication of the basic ex- 
periment and, second, as a means of making 
additional methodologically required compari- 
sons. The reader has undoubtedly noted that 
before one can have any real confidence in 
the interpretation of the phenomena under 
study one would require further experimental 
control and comparison groups. It would be 
desirable, for example, to have a control 
group going through precisely the same se- 
quence of events but with no manipulation 
of emotional state. More im 
one would want some s 
through the mani 
but who Were no 
Perhaps the gro 
has simply come 


= ngzeover. 
Four small assembly lines served as the
experimental groups. Two of these lines were


Festinger, and Hyman 


involved in the assembly of heater units for 
home laundry drying machines and two of the 
lines assembled motor-blower units for these 
same drying machines. There were five op- 
erators employed on each of the lines and as 
in the Louisville I study the lines were largely 
paced conveyor operations. The lines oper- 
ated on day and night shifts and the manipu- 
lations were so arranged that for the heater 
lines the day shift group was the Disfavored 
group and the night shift the Favored group. 
The reverse arrangement, of course, was made 
for the motor-blower lines. 

The changeover was so organized that on 
each line some of the jobs were changed and 
some were not. For each comparable pair of
lines, of course, the same positions were
changed. All told, the jobs of 10 of the
operators were changed while 10 of the jobs
remained the same.

The sequence of events in this study was
precisely the same as that in the previous
studies. Unhappily, at the time of this study,
a particularly rigid schedule forced us to
compress all phases of the study and we could
not devote nearly as much time to the manipulation
of emotional state. As a result,
there appeared to be, at best, only slight differences
in emotional state between the Favored
and Disfavored groups. Protocols dictated
by the manipulators and supervisory
personnel involved gave no indications that
the Disfavored groups were growing angry
and no evidence of differential emotional disturbance
between the experimental groups. Interviews
with and observation of individual
operators on the line corroborated this impression.

Although the manipulations had failed to
produce groups that were differentially satisfied
and dissatisfied, it was possible to classify
the operators, by means of ratings, into satisfied
and dissatisfied categories. During the
study, the experimenters' two assistants had
interviewed all of these operators, talked with
them extensively, and observed them at work
daily. These two assistants independently
rated each worker on a five-point "dissatisfaction"
scale. Their ratings were correlated .84.
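
The agreement between two independent raters can be expressed as a correlation coefficient. The source does not say which coefficient was used; the sketch below assumes the ordinary product-moment (Pearson) correlation, and the ratings in it are hypothetical values invented for illustration, not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length rating lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical five-point "dissatisfaction" ratings for ten operators.
rater_1 = [1, 2, 2, 3, 3, 4, 4, 5, 5, 1]
rater_2 = [1, 2, 3, 3, 2, 4, 5, 5, 4, 2]

agreement = pearson_r(rater_1, rater_2)
print(f"{agreement:.2f}")
```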

To analyze the productivity data, each op- 
erator in the day shift was paired with his 
counterpart in the night shift. Thus, within 




each pair the operators were matched for the 
kind of work they performed. The pairs of 
operators were sorted into two classes: one 
class contained those pairs whose jobs had 
been directly affected by the changeover, and 
the other class contained those pairs whose 
jobs were not changed. Then, the more dis- 
satisfied member of each pair was compared 
with his counterpart to see how each reacted 
to the changeover. 

Table 5 presents the data on the quality of 
work of operators whose jobs were changed. 
It will be noted that before the changeover 
the Satisfied workers consistently make more 
errors than do the Dissatisfied operators. 
After the changeover this is reversed with the 
Dissatisfied operators more than tripling their 
rate of errors while the Satisfied operators in- 
crease their errors by only some 50%. 
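
The size of these changes can be checked against Table 5. The sketch below assumes that the comparison is between the manipulation period and the first week of the changeover; the text does not name the baseline explicitly. The names used are illustrative only.

```python
# Percentage of assembled units requiring repair (Table 5,
# workers whose jobs had been changed).
TABLE_5 = {
    "manipulation": {"satisfied": 1.39, "dissatisfied": 0.83},
    "changeover_week_1": {"satisfied": 2.02, "dissatisfied": 2.64},
}


def error_ratio(group):
    """Changeover error rate divided by the manipulation-period rate."""
    return TABLE_5["changeover_week_1"][group] / TABLE_5["manipulation"][group]


print(f"{error_ratio('dissatisfied'):.2f}")  # 3.18
print(f"{error_ratio('satisfied'):.2f}")     # 1.45
```

On this reading the Dissatisfied operators' error rate rises to a little more than three times its former level, and the Satisfied operators' rate rises by roughly half.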

Similar data for those operators whose jobs
were not changed are presented in Table 6.
Before the changeover time, Dissatisfied operators
make more errors than do Satisfied
ones. After the changeover time, the difference
between these two groups of operators is
even smaller than before. Clearly, the states
of satisfaction and dissatisfaction have affected
the workmanship only of those operators
whose jobs were changed.

Taken by themselves the trends noted in 
Tables 5 and 6 would, of course, be considered 
extremely tenuous because of the post hoc 
nature of the analysis. When considered along 
with the results of the previously described 


TABLE 5
THE RELATIONSHIP OF WORKER SATISFACTION AND
DISSATISFACTION TO THE QUALITY OF WORK DURING
THE LOUISVILLE II EXPERIMENT
(Workers whose jobs had been changed)

                                Percentage of Assembled
                                 Units Requiring Repair
Phase of the        No. of      Satisfied    Dissatisfied
Experiment          Days        Workers      Workers

Premanipulation      3-4          1.40          0.56
Manipulation         8-11         1.39          0.83
Changeover
  First Week         3-5          2.02          2.64


TABLE 6
THE RELATIONSHIP OF WORKER SATISFACTION AND
DISSATISFACTION TO THE QUALITY OF WORK DURING
THE LOUISVILLE II EXPERIMENT
(Workers whose jobs had not been changed)

                                Percentage of Assembled
                                 Units Requiring Rework
Phase of the        No. of      Satisfied    Dissatisfied
Experiment          Days        Workers      Workers

Premanipulation      3-4          0.26          1.06
Manipulation         8-11         0.27          1.82
Changeover
  First Week         3-5          0.19          0.53


studies, however, we can have considerably 
more confidence in these data and, indeed, 
they add one new item to our knowledge of 
the phenomena under consideration. A state 
of dissatisfaction in a worker, whether created 
experimentally or already in existence, will 
adversely affect his productivity only when 
his work is not stereotyped. 


DISCUSSION 


Though the results of any single one of 
these experiments must be treated as a case 
study owing to the small number of cases 
involved, the three experiments viewed to- 
gether do give us considerably more confidence 
in the hypotheses under test. In three experi- 
ments, involving four independent compari- 
sons, we find precisely the same pattern of 
results. Emotional disturbance has little effect 
on stereotyped activity, but does have a dis- 
rupting effect on nonstereotyped activity. 

Though the basic phenomena under test 
seem replicable and reasonably well estab- 
lished, we must admit that our understanding 
of the phenomena is, at best, crude and that 
our theoretical statement is a loose, though 
plausible, formulation which is married only 
roughly to the experimental test situation. 
At almost every step in this formulation we 
have made assumptions for which there is 
relatively little external support. We have, for 
example, assumed that the assembly operation 
during regular production periods is stereo- 
typed activity and during changeover times 


is not. Is this correct? Happily, on this point, 
relevant evidence is available. If this assump- 
tion is correct, we should expect that during 
regular operations the thoughts and conversa- 
tion of the workers will be largely irrelevant 
to the job at hand while during changeover 
times they will be much more concerned with 
their work. In the Louisville I study, a trained 
observer observed each of the experimental 
groups at work for half-hour periods every 
second working day throughout the 10 weeks 
of the study. Among other things, he cate- 
gorized all interactions as either work related 
(relevant to the job at hand or anything to 
do with General Electric) or nonwork related 
(kidding around, remarks about politics, per- 
sonal matters, weather, etc.). In Table 7 we 
have recorded the proportion of operator in- 
teraction that was relevant to work during 
the course of the study. The average number
of interactions recorded during the half-hour
observation periods over the course of the
study was 36.8 for the Favored group and
31.9 for the Disfavored group.

It can readily be seen in this table that
during the manipulation period interaction
among the operators was predominantly irrelevant
to the job. In the period immediately
following the changeover the proportion of
interaction concerning work increased markedly,
and then decreased steadily to prechangeover
levels. To the extent that this
categorization of interaction is acceptable as
an indirect index of stereotypy, our characterization
of the assembly process may be considered
as reasonable and supported by the data.

Perhaps the chief ambiguity in this theoretical
scheme is the precise nature of the
presumed link between emotional disturbance


TABLE 7

THE PERCENTAGE OF OPERATOR INTERACTION
THAT WAS RELEVANT TO WORK

              Manipulation          After Changeover
Group         Period          Days 1-7   Days 8-14   Days 15-21

Favored          33              60         45           31
Disfavored       24              50         35           25




and nonstereotyped behavior. Such a link has 
been hypothesized but in no way has the 
mechanism of this relationship been elabo- 
rated. At this stage of investigation, many 
alternative explanations are possible. For ex- 
ample, it may be that the chief effect of the 
emotionally disrupting manipulations has been 
to diminish the motivation to do a really good 
job. Such an effect would be unlikely to affect 
already stereotyped behavior but might very 
well interfere with the acquisition of good, 
new work habits. Alternatively, it is possible 
that the state of fatigue consequent on the 
repetition of a task requiring concentration is 
chiefly responsible. Possibly when the indi- 
vidual is in such a state of fatigue or ex- 
haustion, his emotions and irritations are par- 
ticularly liable to affect his behavior. When 
behavior is stereotyped, fatigue is less likely 
and performance should not be affected. When 
repetitious behavior is nonstereotyped, the 
interaction of emotional disturbance and 
fatigue may lead to particularly deteriorated 
performance and the sheer repetition of such 
deteriorated behavior may again lead to the 
automatization of ineffective work habits. Still 
other alternatives are possible and it is clear 
that the precise understanding of the phe- 
nomena here demonstrated demands experi- 
mentation specifically directed to clarification 
of this relationship. 

Numerous other questions, of course, open 
up as well. For example, are these effects spe- 
cific to only disturbing emotional states such 
as the anger and hostility which we manipu- 
lated or do they generalize to such strong, but 
benevolent, states as joy and euphoria? This 
question and others are currently under in- 
vestigation in laboratory experiments being 
conducted by J. Arrowood, B. Latané, and B. 
Schuler (1960 unpublished). 


SUMMARY 


Though the introduction of new work procedures
is a frequent event in many industries,
the smoothness of the transition from one
working procedure to another is usually unpredictable.
Sometimes rebalancing an assembly
line proceeds with no difficulty; at
other times productivity drops precipitously
after a change and it requires weeks for a



work group to reach expected quality and pro- 
duction goals. Engineering, planning, super- 
visory, and psychological factors are all in- 
volved in such a change, and it was the pur- 
pose of this study experimentally to examine 
the effects of emotional factors on the success 
of a planned change. 

It is commonly assumed that emotional 
states such as anger and hostility are dis- 
ruptive of performance. It is here hypothesized 
that such emotional states will be maximally 
disruptive of behavior that requires thought 
and concentration, but will have little effect
on behaviors that are stereotyped, that is, be- 
haviors such as walking or eating that are so 
well mastered and habitual that they require 
neither attention nor thought. It is suggested 
that, once mastered, the typical assembly op- 
eration is of precisely this stereotyped char- 
acter. The effect of introducing a change in 
working procedure is to convert the assembly 
operation from a completely stereotyped op- 
eration into one which requires, for a time, 


complete concentration. This analysis suggests 
the following hypotheses: 

1. During regular factory operations, when 
no procedural changes are underway, the qual- 
ity and quantity of production will be little 
affected by wide variations in emotional states 
disturbing the operators. 

2. At times when changes in working pro- 
cedure are being introduced, emotionally dis- 
turbed workers will have more difficulty 
making the transition than will relatively un- 
disturbed operators. 

To test these hypotheses, three independent 
field experiments were conducted on assembly 
groups working in General Electric factories. 
The results of all three studies support the 
hypotheses. 

REFERENCES 
Coch, L., & French, J. R. P. Overcoming resistance
to change. Hum. Relat., 1948, 1, 512-532.
Karsten, A. Psychische Sättigung. Psychol. Forsch.,
1928, 10, 142-154.


(Received December 5, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 214-221


TARGET TRACKING AND ACQUISITION IN THREE 
DIMENSIONS USING A TWO-DIMENSIONAL 
DISPLAY SURFACE 1


CHARLES S. MORRILL AND BARBARA L. DAVIES


MITRE Corporation 


A great deal of experimentation has been 
reported concerning display-control com- 
patibility and its effect upon operator per- 
formance during target acquisition and target 
tracking tasks (Andreas, Finck, Green, 
Smith, & Spragg, 1959; Ellson, 1947; Ely, 
Thomson, & Orlansky, 1956; Fitts, 1951; 
Tufts College, 1952, Part 6, Ch. 3, Sect. 4). 
Most of this past work is concerned with 
display movement in only two dimensions. 
On the other hand, operator difficulties encountered
in target tracking and/or acquisition
in three dimensions, azimuth, range,
and elevation, on a two-dimensional display
surface, remain relatively unexplored. The
objective of the present study is to investi- 
gate the effects of four different display- 
control systems upon operator performance 
during target acquisition and target tracking 
tasks using three dimensions. 

In this study target azimuth and range
were represented by a symbol (a single dot)
generated by one channel of a Dumont
dual-beam scope and capable of moving
along the x and y axes simultaneously.
Target elevation was represented by a symbol
(a short vertical line) generated by the
second scope channel and capable of vertical
movement along the right-hand strip of the
display surface. The display symbols which
represented the ... be referred to as the
azimuth-range symbol ... static display is ...
made by Dunlap and ... a hand-control design

1 The research reported in this article was supported
by the Department of the Air Force under
Air Force Contract AF-33(600) 39852. The original
version of this article was published as a MITRE
Technical Series Report, January 1961.


in an airborne radar system. This recommendation
suggests a display-control configuration
which is a functional representation of
the information. As a result, an incompatibility
exists between the direction of the displayed
motion of the azimuth-range symbol
in range and the elevation symbol on the display
surface; i.e., a forward motion of the
hand control results in an upward motion of
the azimuth-range symbol, while a similar forward
motion of the thumbwheel results in a
downward motion of the elevation symbol.

A survey carried out by Morrill and
Sprague (1960) indicated that there is an
over-all preference for a display which is a
direct representation of the hand-control
movements. This direct representation employs
a compatible system rather than the
incompatible system which resulted from
Dunlap's functional representation of the
equipment. (A system is compatible if
identical directional movements of the hand
control and the thumbwheel produce movements
of the azimuth-range symbol and the
elevation symbol that are the same in direction
on the scope.) ... of subjects indicated a
direction-of-motion preference, namely, a
backward motion of the hand control and
thumbwheel to produce ...ward movements of
the symbols of the display. Subjects in the
survey study were asked to state their
display-control preferences in order to
acquire specific targets, whereas this
manuscript reports actual performance during
target acquisition and target tracking
tasks when both compatible and incompatible
display-control relationships were used.

The results of the study described in the 
present report are generally applicable in
systems which require information concern- 
ing the display-control relationships appro- 
priate for the operation of dynamic controls 
and in systems which provide for manual 




Target Tracking and Acquisition 215 


Fig. 1. Static display, showing the target azimuth-range position, the target elevation position, the elevation symbol, and the azimuth-range symbol.

hand control movements as either the pri- 
mary or the back-up mode of operation. For 
example, in the design of the display and 
control portions of an airborne radar sys- 
tem, the data from this study may serve to 
answer questions which arise regarding the 
compatibility of the movements of a hand 
control with the corresponding movements
of the display symbols.

METHOD 


Subjects 

Four groups of 10 subjects each were used. The
subjects were Air Force enlisted personnel from
the 3245th AC&W Squadron, Hanscom Air Force
Base, Bedford, Massachusetts.


Tasks 


Each subject performed two tasks. During Task
I, the subject was instructed to acquire the target
(i.e., to achieve lock-on) as quickly and accurately
as possible. This task required that the subject
place the azimuth-range symbol (a small circle)
around the target and similarly superimpose the
elevation symbol on the target. Each subject was
given four practice trials prior to 64 test trials.
Task II, for each subject, was to track a moving
target in azimuth, range, and elevation for a period
of 5 minutes during each of four trials.

Groups A and B used compatible display-control 
systems for both tasks, whereas Groups C and D 
used incompatible display-control systems. The dis- 
play-control systems that were used are listed in 
Table 1. 
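
The compatibility rule can be written down as a small check. The sketch below records, for each group, the direction in which each symbol moves when its control is moved backward, as listed in Table 1. The dictionary layout and the function name are assumptions made for illustration.

```python
# Direction of symbol movement produced by a BACKWARD movement of
# each control (Table 1).
SYSTEMS = {
    "A": {"hand_control": "up", "thumbwheel": "up"},
    "B": {"hand_control": "down", "thumbwheel": "down"},
    "C": {"hand_control": "down", "thumbwheel": "up"},
    "D": {"hand_control": "up", "thumbwheel": "down"},
}


def is_compatible(system):
    """A system is compatible when the same control movement moves the
    azimuth-range symbol and the elevation symbol in the same direction."""
    return system["hand_control"] == system["thumbwheel"]


compatible = sorted(g for g, s in SYSTEMS.items() if is_compatible(s))
incompatible = sorted(g for g, s in SYSTEMS.items() if not is_compatible(s))

print(compatible)    # ['A', 'B']
print(incompatible)  # ['C', 'D']
```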


Apparatus 


Each subject was seated in front of a dual-beam 
cathode-ray-tube display (CRT) mounted on a 
console. Also on the console there appeared four 
lights, one each to indicate coincidence for the 
individual dimensions and one to indicate the 
achievement of lock-on. Coincidence was defined 
as that period of time during which the target and 
the hand control had equivalent positions for that 
particular parameter. Lock-on was defined as that 
period of time during which there was coinci- 
dence in azimuth, range, and elevation simultane- 
ously. When lock-on was achieved, all four lights 
were lit. 
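
The two definitions can be illustrated with sampled data. In the sketch below each dimension is represented by a list of true/false samples marking coincidence at successive instants; lock-on holds only where all three are true together. The sampling interval and the sample values are hypothetical, since the apparatus itself timed these periods with clocks.

```python
def lock_on_flags(azimuth, range_, elevation):
    """Lock-on holds at an instant only when all three dimensions
    are in coincidence simultaneously."""
    return [a and r and e for a, r, e in zip(azimuth, range_, elevation)]


def duration(flags, dt):
    """Total time, in seconds, for which a condition held."""
    return sum(flags) * dt


DT = 0.5  # hypothetical sampling interval in seconds

# Hypothetical coincidence samples for one short stretch of tracking.
azimuth = [True, True, False, True, True, True]
range_ = [False, True, True, True, True, False]
elevation = [True, True, True, False, True, True]

lock_on = lock_on_flags(azimuth, range_, elevation)

print(duration(azimuth, DT))  # 2.5
print(duration(lock_on, DT))  # 1.0
```

Lock-on time can never exceed the coincidence time of any single dimension, which is why the two kinds of measure are recorded separately.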

The monitor's console, a CRT, was mounted
with three clocks, which provided a measure of the
time during which a subject achieved coincidence
in the parameters of azimuth, range, and elevation.
From another clock mounted next to the target 
position selection panel, data could be obtained 
concerning the time necessary to acquire lock-on 
or the duration of lock-on to the target. During 
the target acquisition phase of the experiment, 
Task I, the experimenter inserted the desired target 
parameters by means of the push-button target 
position selection panel, there being eight possible 
selections for each of the three dimensions— 
azimuth, range, and elevation. The subjects’ per- 
formance in azimuth and range was monitored 
by means of the monitor's scope. A second scope 
provided the experimenter with information con- 
cerning the subjects’ performance in the elevation 
dimension. In addition, during the target acquisi- 
tion phase initial reversals, defined as initial move- 
ments in range and elevation which were in a 
direction opposite to where the target appeared 
on the scope, were recorded by a six-channel San- 
born recorder. 


TABLE 1
DISPLAY-CONTROL SYSTEMS

            Hand         Azimuth-Range                     Elevation
Groups      Control      Symbol           Thumbwheel       Symbol

A           Backward     Up               Backward         Up
B           Backward     Down             Backward         Down
C           Backward     Down             Backward         Up
D           Backward     Up               Backward         Down

216 Charles S. Morrill and Barbara L. Davies 


Fig. 2. Hand control, with the thumbwheel labeled.


The hand control used by the subjects in this 
study, as shown in Figure 2, was a handgrip with 
a serrated wheel mounted at the top. Pivotal rotation
of the handgrip to the left and right or
forward and backward produced movement of the 
azimuth-range symbol (a small circle), respectively, 
in azimuth and range. Forward and backward 
rotation of the thumbwheel produced movement of 
the elevation symbol. 

Pressure on the trigger of the hand control acti- 
vated the display system. The ratios of movement 
of the hand control and thumbwheel to movement 
of the azimuth-range and elevation symbols were 
as follows: (a) ±1° movement of the thumbwheel
produced ±1 millimeter displacement of the elevation
symbol; (b) ±1° movement of the hand control
in the azimuthal and range directions produced,
respectively, ±1.33 millimeter displacement of the
azimuth-range symbol in azimuth and range.

The dynamic target was produced by using two
low-frequency function generators. The azimuth
signal was a triangular wave generated at a fre- 
quency of 02 cycles Per second; the range and 
elevation signals were triangular waves generated 
at a frequency of 01 Cycles per second. 
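
A target of this kind can be sketched as three triangular waves, with the azimuth signal running at twice the frequency of the range and elevation signals. The base frequency is left as a parameter here, and the unit amplitude and common starting phase are assumptions of the sketch, since the source describes only the wave shape and the frequencies.

```python
def triangle_wave(t, frequency_hz, amplitude=1.0):
    """Symmetric triangular wave between -amplitude and +amplitude,
    starting at -amplitude when t = 0."""
    phase = (t * frequency_hz) % 1.0
    if phase < 0.5:
        value = 4.0 * phase - 1.0  # rising half of the cycle
    else:
        value = 3.0 - 4.0 * phase  # falling half of the cycle
    return amplitude * value


def target_position(t, base_frequency_hz):
    """Return (azimuth, range, elevation) at time t in seconds.

    Azimuth runs at twice the base frequency; range and elevation
    run at the base frequency. Equal phase is assumed.
    """
    azimuth = triangle_wave(t, 2.0 * base_frequency_hz)
    range_ = triangle_wave(t, base_frequency_hz)
    elevation = triangle_wave(t, base_frequency_hz)
    return azimuth, range_, elevation


# Example with a hypothetical base frequency of 1 cycle per second.
print(target_position(0.25, 1.0))  # (1.0, 0.0, 0.0)
```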


Target Selection 


As noted above, 64 trials were given to each
subject during Task I. The equipment was constructed
with push-button target position selectors,
eight each for azimuth, range, and elevation, so
that 512 different target positions were possible.
The objective of this experiment was to investigate
display-control compatibility in range and elevation
only; thus, 64 trials were administered to each
subject in order to include all of the 64 range-elevation
combinations.
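
The two counts follow directly from the eight selections available in each dimension, as a brief enumeration shows. The numbering of the selector positions from 1 to 8 is an arbitrary labeling chosen for this sketch.

```python
from itertools import product

POSITIONS = range(1, 9)  # eight push-button selections per dimension

# Every (azimuth, range, elevation) setting the panel allows.
all_targets = list(product(POSITIONS, repeat=3))

# Every (range, elevation) pairing, ignoring azimuth.
range_elevation = list(product(POSITIONS, repeat=2))

print(len(all_targets))      # 512
print(len(range_elevation))  # 64
```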

Range. Range was measured from the horizontal
diameter of the scope with four range positions
above this diameter and four range positions below.
Absolute deviation was measured as a perpendicular
drop to the diameter. The two range positions
which appeared nearest the diameter, one above
and one below, were assigned an absolute-deviation
level of I. Successive range positions above
and below the diameter were assigned absolute-deviation
levels of II, III, and IV, respectively.

Elevation. An initial point in elevation was defined
as a center point with four elevation positions
above and four below this origin. Absolute deviation
from the elevation origin was determined in
exactly the same manner as described above
for range.

Azimuth. Four absolute-deviation levels were assigned
to azimuth in the same manner as described
for range. Absolute deviation in this case
was measured as the perpendicular distance to the
azimuthal diameter.

Sixteen range-elevation deviation combinations
resulted from considering all possible pairings of
the range and elevation absolute-deviation levels.
Four azimuthal positions, two from the left and
two from the right, were selected for each of these
16 combinations. Each of the four absolute-deviation
levels in azimuth was represented in
each of the 16 range-elevation pairings. Within
each range-elevation pairing, the composite deviation
level from the azimuthal diameter of the two
left azimuthal positions was equal to the composite
deviation level of the two right azimuthal
positions. A total of 64 trials resulted, with the
azimuth parameter equally distributed among the
range-elevation deviation combinations.
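
The balancing constraint on azimuth leaves very few admissible arrangements, which a short search makes plain. The sketch assumes that the "composite deviation level" of two positions is the sum of their levels and that each of the four levels is used exactly once within a pairing; neither point is spelled out in the source.

```python
from itertools import combinations

LEVELS = (1, 2, 3, 4)  # absolute-deviation levels I to IV


def balanced_splits(levels=LEVELS):
    """List every way of giving two levels to the left side and the
    remaining two to the right side with equal composite (summed)
    deviation on both sides."""
    splits = []
    for left in combinations(levels, 2):
        right = tuple(lv for lv in levels if lv not in left)
        if sum(left) == sum(right):
            splits.append((left, right))
    return splits


print(balanced_splits())  # [((1, 4), (2, 3)), ((2, 3), (1, 4))]

# 16 range-elevation deviation combinations, 4 azimuth positions each.
print(4 * 4 * 4)  # 64
```

Under these assumptions the only balanced arrangement pairs levels I and IV on one side with levels II and III on the other, in either left-right orientation.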


Scoring 


During Task I, two measures were recorded:
time to acquire lock-on and composite initial reversals,
defined as the sum of initial reversals in
range and elevation. An initial reversal made in
order to achieve coincidence in range was considered
an error. Likewise, an error was recorded if
an initial reversal was made to achieve coincidence
in elevation. During Task II, lock-on time and
the length of time of coincidence in each of the
dimensions were recorded.
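
The Task I error score can be sketched as follows. An initial reversal is scored when the first movement of a control drives its symbol away from the target; the composite score adds the range and elevation reversals over trials. The trial records and coordinate values below are hypothetical, and the function names are illustrative only.

```python
def is_initial_reversal(start, target, first_movement):
    """True when the first movement is directed away from the target.

    start and target are positions on one display axis; first_movement
    is the signed displacement of the symbol on its first move.
    """
    needed = target - start
    return needed * first_movement < 0


def composite_initial_reversals(trials):
    """Sum of initial reversals in range and in elevation over trials."""
    total = 0
    for trial in trials:
        total += is_initial_reversal(*trial["range"])
        total += is_initial_reversal(*trial["elevation"])
    return total


# Hypothetical trials: (start, target, first movement) per dimension.
trials = [
    {"range": (0.0, 5.0, -1.0), "elevation": (0.0, -3.0, -0.5)},
    {"range": (0.0, -2.0, -1.0), "elevation": (0.0, 4.0, -0.5)},
]

print(composite_initial_reversals(trials))  # 2
```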


RESULTS AND DISCUSSION 


Task I 


Four groups of subjects were asked to ac- 
quire targets in azimuth, range, and eleva- 
tion as quickly and accurately as possible. 




Each group used a different display-control 
system. 

An analysis of variance was carried out to 
investigate differences among the four dis- 
play-control systems in terms of composite 
initial reversals in range and elevation. A 
statistically significant difference at the .005 
confidence level was found among the four 
groups. Bartlett's test of homogeneity of 
variance was carried out among the four 
groups. No statistically significant difference 
was found. 
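
The form of the analysis can be shown with a textbook one-way analysis of variance. The sketch computes the F ratio for four groups of 10 subjects; the reversal counts are hypothetical values invented for illustration, and the computational details of the original analysis, including Bartlett's test, are not given in the source and are not reproduced here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / len(g)
        ss_between += len(g) * (mean - grand_mean) ** 2
        ss_within += sum((x - mean) ** 2 for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical composite initial reversals, 10 subjects per group.
group_a = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
group_b = [2, 1, 3, 2, 2, 1, 3, 2, 1, 2]
group_c = [4, 3, 5, 4, 3, 4, 5, 3, 4, 4]
group_d = [6, 5, 7, 5, 6, 4, 6, 5, 7, 6]

f_ratio, df_between, df_within = one_way_anova_f(
    [group_a, group_b, group_c, group_d]
)
print(df_between, df_within)  # 3 36
```

The F ratio would then be referred to the F distribution with 3 and 36 degrees of freedom to obtain the significance level.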

In Table 2 are shown the results of 
Tukey's test (Bowker & Lieberman, 1959) 
to determine for which pairs among the four 
groups there was a statistically significant dif- 
ference between composite initial reversals. 
Only Groups B and D differed significantly. 
The performance of Group B, whose subjects 
used the display-control relationship which 
required pulling back on the hand control 
and thumbwheel to produce a downward mo- 
tion on the scope, was superior to the per- 
formance of Group D, whose members were 
required to pull back on the hand control 
in order to produce upward movement and 
to pull back on the thumbwheel to produce 
downward movement on the scope.

Further analyses of variance were carried
out to determine if for this sample the
initial reversals in range and elevation con-
tributed equally to the difference among the
groups. The results indicated that there was
a statistically significant difference for initial
range reversals at the .005 level of confi-
dence, but that there was not a significant
difference among the groups in terms of
initial elevation reversals. These data sup-
port the notion that at least for this sample
the differences observed among groups were
attributable to initial reversals with the range 
control (forward and backward stick move- 
ment) rather than to initial reversals with 
the elevation control (thumbwheel). One ex- 
planation of the significance of initial re- 
versals in range is found in the method which 
the subjects employed to acquire the target. 
In most cases, subjects moved the azimuth- 
range control before they moved the eleva- 
tion wheel. If the subject had learned a 
particular display-control system and yet 
made an initial reversal in range, the in- 
formation gained from the reversal action 
would aid him in achieving the correct re- 
sponse in elevation. 

The previously mentioned survey by Mor- 
rill and Sprague (1960) pointed out that 
there existed a preference for a compatible 
display-control system, i.e., for a forward 
or backward motion of both the hand con- 
trol and the thumbwheel to produce the 
same directional movements on the scope. 
Task I of the present study, which re- 
quired the subjects to carry out a target 
acquisition task, seemingly does not support
these previous findings. Both Group A and 
Group B used internally compatible display- 
control systems for range and elevation, while 
Group C and Group D used internally in- 
compatible systems for range and elevation; 
yet, the only statistically significant differ- 
ence in terms of initial reversals was be- 
tween one compatible and one incompatible 
system. There were no statistically sig- 
nificant differences between any of the other 
pairs. Further examination of the data pro- 
vided an explanation for the unique differ- 
ence between Group B and Group D. The 


TABLE 2
MEAN COMPOSITE INITIAL REVERSALS AND ASSOCIATED STANDARD DEVIATIONS;
k-VALUE CONFIDENCE LEVELS: TASK I

                                    Matrix of k-Value Confidence Levels
                                    between Group Means
Groups   Mean       SD                   B        C        D
A        [illeg.]   [illeg.]        A    ns       ns       ns
B        36.4       [illeg.]        B             ns       sig.
C        47.8       7.9             C                      ns
D        55.0       9.3




two important factors in this explanation
are: the compatibility of range and elevation
movements within each display-control sys-
tem and the specific display-control relation-
ships for range. Groups A and B used dis-
play-control systems that were internally 
compatible in terms of movements associated 
with both range and elevation; no difference, 
then, would be expected between these two 
groups. Groups C and D both used systems
that were incompatible within themselves; 
thus again no difference would be expected. 
Although Group A used an internally com- 
patible system and Group D did not, both 
groups used the same configuration to con- 
trol range. Since the effect of initial range 
reversals proved to be significant whereas 
the effect of initial elevation reversals did 
not, the lack of difference in composite initial 
reversals between Groups A and D may 
well be attributed to the similarity of their 
range configurations. This explanation is also
appropriate for the lack of difference between
Groups B and C.

The performance of Group B was superior
to that of Group D. Group B used an internally
compatible system: the hand control and the
thumbwheel produced the same directional
movements of the range and elevation symbols
on the scope. Furthermore, for Group B, a back-
ward movement of the hand control pro-
duced a downward movement of the azimuth-
range symbol. Since the effects of initial
range reversals proved to be significant,
whereas the effect of initial elevation rever-
sals did not, it appears that the configuration
used by Group B to control range was pre-
ferred to the one used by Group D. Perhaps
stimulus-response compatibility for range and
elevation within a display-control system
and, in addition, a second-order interaction
of a specific stimulus-response configuration
for range, namely, a backward movement of
the hand control to produce a downward
movement of the azimuth-range symbol, must
be operative for one group and neither of
these conditions operative for another group
in order to produce a statistically significant
difference between the groups. Group B had
both conditions operating, internal
compatibility and the preferred range con-
figuration, whereas Group D had neither.
Thus a difference was demonstrated. Con- 
cerning Groups A and C, although Group A 
employed an internally compatible system 
and Group C did not, Group C, but not 
Group A, utilized the preferred range con- 
figuration. In this case, therefore, no differ-
ence was found.

An analysis of variance was carried out 
to determine if there was a significant differ-
ence among the four groups in terms of
time to acquire lock-on during Task I. No 
statistically significant difference was found. 


Task II 


In Table 3 are shown the results of an 
analysis of variance in terms of time in sec- 
onds that lock-on to the target was achieved 
during a total of four 5-minute trials. There 
was a statistically significant difference
among the groups at the .05 confidence level
and between trials at the .001 confidence 
level. Bartlett’s test of homogeneity of vari- 
ance was carried out among the four groups. 
No statistically significant difference was 
found. 

In Table 4 are shown the results of Tu-
key's test to determine for which pairs among
the four groups there was a statistically sig-
nificant difference between total mean time
in seconds of lock-on to the target. Groups A


TABLE 3

ANALYSIS OF VARIANCE FOR TIME OF LOCK-ON TO TARGET: TASK II

Source of Variance                        df          SS          MS        F
Between groups                             3    31,403.3    10,467.8    3.70*
Between subjects in same group            36   101,753.8     2,826.5
Total between subjects                    39   133,157.1
Between trials                             3    36,705.0    12,235.0   62.4**
Interaction: groups X trials               9     1,581.2       175.7
Interaction: pooled subjects X trials    108    21,171.8       196.0
Total within subjects                    120    59,458.0
Total                                    159   192,615.1

* Significant at .05 level.
** Significant at .001 level.
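
The arithmetic behind Table 3 can be checked with a short sketch. It assumes the sums of squares read from the table, and it assumes degrees of freedom of 3 for trials and 9 for the groups-by-trials interaction (4 groups of 10 subjects, 4 trials), which is what the subtotals of 39, 120, and 159 imply. The names `TABLE_3`, `mean_square`, and `f_ratio` are illustrative only.

```python
# Sums of squares (SS) as read from Table 3.
# The df values for trials (3) and groups x trials (9) are assumptions
# inferred from the subtotals, i.e. 4 groups, 10 subjects each, 4 trials.
TABLE_3 = {
    "between_groups":         {"ss": 31403.3,  "df": 3},
    "subjects_within_groups": {"ss": 101753.8, "df": 36},
    "between_trials":         {"ss": 36705.0,  "df": 3},
    "groups_x_trials":        {"ss": 1581.2,   "df": 9},
    "subjects_x_trials":      {"ss": 21171.8,  "df": 108},
}


def mean_square(source):
    """Mean square = sum of squares divided by its degrees of freedom."""
    row = TABLE_3[source]
    return row["ss"] / row["df"]


def f_ratio(effect, error):
    """F = mean square of the effect over the mean square of its error term."""
    return mean_square(effect) / mean_square(error)


# Groups are tested against subjects within groups; trials and the
# interaction are tested against the pooled subjects x trials term.
f_groups = f_ratio("between_groups", "subjects_within_groups")
f_trials = f_ratio("between_trials", "subjects_x_trials")
f_interaction = f_ratio("groups_x_trials", "subjects_x_trials")

total_ss = sum(row["ss"] for row in TABLE_3.values())
total_df = sum(row["df"] for row in TABLE_3.values())

print(f"F groups      = {f_groups:.2f}")
print(f"F trials      = {f_trials:.2f}")
print(f"F interaction = {f_interaction:.2f}")
print(f"Total SS = {total_ss:.1f}, total df = {total_df}")
```

Under these assumptions the component sums of squares and degrees of freedom add up to the totals printed in the table, and the interaction ratio falls below 1, which agrees with no significance marker being attached to it.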


Target Tracking and Acquisition 


TABLE 4
MEAN TIMES OF LOCK-ON TO TARGET AND ASSOCIATED STANDARD DEVIATIONS;
k-VALUE CONFIDENCE LEVELS: TASK II

                                         Matrix of k-Value Confidence Levels
                                         between Group Means
Groups   Mean Time of Lock-on   SD            B       C       D
A        154.5                  30.5     A    ns      .05     ns
B        127.3                  32.2     B            ns      ns
C        [illeg.]               35.2     C                    ns
D        125.4                  28.6


and C were significantly different from each 
other at the .05 confidence level. 

With the exception of the pairings of 
Groups B and D and Groups A and C, the 
reasons to explain the absence of statistically 
significant differences between pairings of 
groups are the same as for target acquisition. 
Concerning Groups A and C and Groups 
B and D, the results support the explanation 
that during the tracking task, as during the 
acquisition task, the stimulus-response com-
patibility for range and elevation within a
display-control system and, in addition, a
second-order interaction of a specific stimu-
lus-response configuration for range must
be operative for one group and neither of
these conditions operative for another group
in order to produce a statistically significant
difference between the groups. However, in
contrast to the target acquisition task, the
preferred range configuration for tracking
was a backward movement of the hand con-
trol to produce an upward movement of the
azimuth-range symbol. During the tracking 
task, Group A had both conditions operating
(internal compatibility and the preferred
range configuration), whereas Group C had
neither. Furthermore, while Group B em-
ployed an internally compatible system and
Group D did not, Group D, but not Group 
B, utilized the preferred range configuration. 
Thus, a difference was found between 
Groups A and C and not between Groups 
B and D. The reason why subjects preferred 
one display-control configuration for range 
during target acquisition and a different dis- 
play-control configuration for range during 
target tracking remains an open question. 
Table 3 showed a statistically significant 
difference between trials. For all groups 
combined in Task II, trials were compared 
to see if practice improved performance dur- 
ing the four trial sessions, with improvement 
measured by mean time in seconds of lock- 
on to the target. By using Tukey’s method 
to determine which pairs among the four 
trials differed, it was found that there was 
a statistically significant difference at the 
.01 level between all trial means except be-
tween Trials II and III, where the difference
was significant at the .05 level. These
data and standard deviations associated with
trial means are presented in Table 5. Ex-
amination of these data shows that the sub-
jects in all groups were in fact able to im-
prove performance as a function of training.
The duration and number of the trials, 5
minutes for each of four trials, and the in-
tensity of the task might have had an ad-
verse effect upon performance. The positive
effect of practice, however, appears to be
more dominant than performance degrada-
tion by fatigue. A pictorial representation,
by groups and trials, of mean time of lock-on
to the target is given in Figure 3.


TABLE 5
MEAN TRIAL TIME OF LOCK-ON TO TARGET AND ASSOCIATED STANDARD DEVIATIONS;
k-VALUE CONFIDENCE LEVELS: TASK II

                                                                            Matrix of k-Value Confidence
                         Groups                          All Groups         Levels between Trial Means,
                                                                            All Groups
Trial   A          B          C          D          M          SD          Trials   2      3      4
1       [illeg.]   106.8      98.1       104.3      108.2      29.5        1        .01    .01    .01
2       [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.]    2               .05    .01
3       164.0      133.8      121.9      133.1      138.2      27.8        3                      .01
4       [illeg.]   143.4      136.4      139.7      149.4      35.4


Fig. 3. Mean times of lock-on to target by groups and trials: Task II.


SUMMARY 


Charles S. Morrill and Barbara L. Davies 


1. During target acquisition, the display- 
control relationship which required that the 
subjects pull back on the hand control and 
thumbwheel to produce a downward motion 
on the scope was superior to the control 
which required that subjects pull back on 
the hand control to produce an upward 
movement and pull back on the thumbwheel 
to produce a downward movement on the 
scope. The difference in composite initial 
reversals which was found was attributable 
to stimulus-response compatibility of the 
range and elevation controls within a display- 
control system and to a second-order inter- 
action attributable to a specific display- 
control relationship for range, namely, a 
backward movement of the hand control to 
produce a downward movement of the azi- 
muth-range symbol. 

2. During target tracking, the display-con- 
trol relationship which required that the sub- 
jects pull back on the hand control and 
thumbwheel to produce an upward motion 
on the scope was superior to the control 
which required that the subjects pull back 
on the hand control to produce a downward 
movement and pull back on the thumbwheel 
to produce an upward movement on the 
scope. The difference in time of lock-on 
which was found was attributable to the 
stimulus-response compatibility of the range 
and elevation controls within a display-con- 
trol system and to a second-order interaction 
attributable to a specific display-control re-
lationship for range, namely, a backward
movement of the hand control to produce 
an upward movement of the azimuth-range 
symbol. 


REFERENCES 


Andreas, B. G., Finck, A., Green, R. F., Smith,
S., & Spragg, S. D. S. Two-dimensional compensa-
tory tracking performance as a function of con-
trol-display movement relationships, positioning
vs. velocity relationship, and miniature vs. large
stick control. J. Psychol., 1959, 48, 237-246.

Bowker, A. H., & Lieberman, G. J. Engineering
statistics. Englewood Cliffs, N. J.: Prentice-Hall,
1959.

Dunlap & Associates. Design of hand control for
NAV/A1. Unpublished confidential report, 1957.

Ellson, D. G. Independence of tracking in two
and three dimensions with B-29 (GE) pedestal
sight. TSE-AA-694-26, August 8, 1947, Indiana
University, United States Materiel Command,
Wright Field.

Ely, J. H., Thomson, R. M., & Orlansky, J. Lay-
out of workplaces. USAF WADC tech. Rep., 1956,
No. 56-171. (ASTIA Document No. AD 110507)

Fitts, P. M. Engineering psychology and equip-
ment design. In S. S. Stevens (Ed.), Handbook
of experimental psychology. New York: Wiley,
1951. Pp. 1287-1340.

Morrill, C. S., & Sprague, L. T. Operator
preferences for movement compatibility between
radar hand control and display symbology. J.
appl. Psychol., 1960, 44, 137-140.

Tufts College, Institute for Applied Experimental
Psychology. Handbook of human engineering data.
Medford, Mass.: Tufts College, 1952. (Contract
N6onr-199, T.O. 1)


(Early publication received January 10, 1961) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 222-224


CONSUMER VERSUS MANAGEMENT REACTION 
IN NEW PACKAGE DEVELOPMENT 


MILTON L. BLUM AND VALENTINE APPEL


Marketing, Merchandising and Research, Incorporated


The last decade has been one of radical
change in packaging as it has in many other
areas of commercial life. A real part of this
packaging revolution has been the contribu-
tion which consumer research has made. In
fact, the introduction of a new package
without the benefit of consumer research
evaluation is becoming the exception rather
than the rule. The study to be reported
points up the importance of consumer re-
search in such package development pro-
grams.

The writers' firm was engaged to conduct
a preliminary packaging study for one of
its clients. The client's objective was to de-
velop a package for a new product line. The
product was intended for use by men but to
be bought by women as a gift.

The purpose of the study was to screen,
from a group of 18 design renderings sub-
mitted, the designs which showed the most
promise, and to indicate possible areas of
design modification which might further im-
prove the acceptance of the more promising
of the design concepts. The principal inten-
tion was ultimately to evaluate the more
promising designs further based upon three
dimensional prototypes and larger samples of
respondents.

Earlier research had detailed certain
specifications which the ideal package should
meet. Among these was the decision that the
package should appear both as masculine and
relatively expensive. Moreover, women should
prefer it as a gift for their husbands, and men
should prefer to receive it as a gift for them-
selves.

The study was unusual in that not only
consumers were interviewed. The client's
management, and also the design firm which
created the packages, agreed to evaluate the
designs from what they considered to be the
female consumer's point of view. There was,
therefore, the opportunity to compare the
judgments of designers, management, and
consumers.


METHOD


The study employed four independent groups of
raters: female consumers (N = 80), male con-
sumers (N = 39), advertising and marketing ex-
ecutives of the client company (N = 8), and the
industrial designers who created the packages (N =
7). Each of the members of these groups indi-
vidually rated a total of 18 different package de-
sign renderings using Stephenson's (1953) Q sort
technique. The 18 designs were rated in terms of
the extent to which each design was perceived as:
masculine or feminine, expensive or inexpensive,
and appropriate or inappropriate as a male gift.


The Q sort was performed by asking the re-
spondent to arrange the renderings into seven
scaled categories, each category being assigned a
score ranging from one to seven. For each re-
spondent, this resulted in a forced frequency distri-
bution of scores for the 18 designs. This frequency
distribution was perfectly symmetrical, approached
normality in shape, and had a modal score of
four which was assigned to six of the 18 designs.
The forced frequency distribution and the scores
assigned to each category were as follows:

Frequency   1   2   3   6   3   2   1
Score       1   2   3   4   5   6   7
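
The properties claimed for the forced distribution can be verified with a minimal sketch. It assumes the frequencies 1, 2, 3, 6, 3, 2, 1 over scores 1 to 7, which is the reading consistent with 18 designs and a modal score of four assigned to six designs; the variable names are illustrative only.

```python
# Forced Q-sort distribution, assuming frequencies 1, 2, 3, 6, 3, 2, 1
# for scores 1 through 7 (the reading consistent with 18 designs).
scores = [1, 2, 3, 4, 5, 6, 7]
frequencies = [1, 2, 3, 6, 3, 2, 1]

# Every respondent places all 18 designs, so the frequencies must sum to 18.
total_designs = sum(frequencies)

# A symmetrical distribution reads the same forwards and backwards.
is_symmetrical = frequencies == frequencies[::-1]

# The modal score is the score with the largest frequency.
modal_score = scores[frequencies.index(max(frequencies))]

# Because the distribution is forced, every respondent has the same mean score.
mean_score = sum(s * f for s, f in zip(scores, frequencies)) / total_designs

print(total_designs, is_symmetrical, modal_score, mean_score)
```

Because every respondent's scores have the same mean and spread, group mean scores for a design are directly comparable across rater groups.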


The advertising and marketing executives of the
client company, and the members of the design
firm, Q sorted the same 18 designs only on the
basis of the extent to which they believed that
women would be willing to give each of the pack-
ages to their husbands as a gift. This made a
total of eight variables to be analyzed: three each
for the male and female consumers, and one each
for management and the designers. Because of the
amount of time involved in rating the designs on
each variable, it was not considered desirable to
request the management and design groups to rate
the designs on more than one variable. The
ostensible purpose of asking management and the
designers to complete the ratings was primarily
as a device to explain to them the method em-
ployed.


RESULTS 
Each design was assigned an overall rating
for each variable which was the mean
score for the group evaluating the designs,
and the mean scores were converted into 
ranks for each variable. To measure the ex- 
tent of agreement and disagreement among 
the four groups of raters, the Spearman 
rank-difference correlations among the eight 
variables were calculated. 
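
The ranking and correlation step can be sketched as follows. This is a minimal illustration of the Spearman rank-difference formula, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)), not the authors' own computation; the function names and the five-design example scores are hypothetical, and the formula is exact only when there are no tied ranks.

```python
def ranks(values):
    """Rank values from 1 (highest) to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_difference_correlation(x, y):
    """Spearman rank-difference correlation between two lists of mean scores."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical mean scores for five designs from two rater groups.
group_a = [5.1, 4.2, 3.9, 2.0, 1.5]
group_b = [1.5, 2.0, 3.9, 4.2, 5.1]

rho_same = rank_difference_correlation(group_a, group_a)
rho_opposite = rank_difference_correlation(group_a, group_b)
print(rho_same, rho_opposite)
```

In the study itself each variable would contribute 18 mean scores, one per design, so n = 18 in the formula.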

The correlations for the gift ratings be- 
tween the men and women and between 
management and the designers were as fol- 
lows: .58 between the men and women, and 
.55 between management and the designers. 
From this it can be seen that there was fair 
agreement between management and the de- 
signers as to which packages they believed 
women would be more likely to prefer as 
gifts for their husbands. There was also 
fair agreement between the men and women 
as to which designs they would like to give 
and receive. Both the male and female con- 
sumers, however, were in substantial dis- 
agreement with the other two groups on this 
point. The correlations between the con-
sumers and the management and designer
groups were as follows: -.48 between the de-
signers and the women, -.14 between the
designers and the men, -.21 between man-
agement and the women, and -.42 between
management and the men. The reasons under-
lying this disagreement can be understood in
terms of the matrix of intercorrelations among 
all eight variables as shown in Table 1. 

Examination of this correlation matrix re-
veals two clearly defined clusters. The first
cluster is composed of the gift ratings of
management and of the designers, and of 
the masculinity ratings of the male and fe- 
male consumers. The second cluster is com- 
posed of the gift and the expensiveness rat- 
ings of the consumers. The two clusters cor- 
relate negatively with each other. The one 
exception is the low positive correlation 
(.23) between the gift and masculinity rat- 
ings of the male consumers. 

The reason for this disagreement, between 
the consumers on the one hand and man- 
agement and the designers on the other, 
stems from the fact that these two groups 
were apparently using conflicting criteria in 
evaluating the designs. Management and the 
designers were evaluating the designs in 
terms of what the consumer perceived to be 
masculinity. Those designs which were per- 
ceived as being more masculine tended to 
be the same ones which the designer and 
management groups thought the consumers 
would prefer. The ratings of the consumers, 
on the other hand, tended to vary as a func- 
tion of what they considered to be the ex- 
pensive appearance of the design. 

In this particular case expensiveness and
masculinity appear to be relatively incom-
patible criteria, the correlation between them
being -.73 among the women, and -.47
among the men. Since the two groups of
raters tended to use one of these criteria to
the relative exclusion of the other, the gift
ratings of the consumers tended to corre-
late negatively with the ratings of the client's
management and of the designers who created
the designs. This is not to say that mascu-
linity was completely unimportant among
the consumer samples. It is to say that of
the two variables, masculinity and expensive-
ness, expensiveness was the more important.
Actually, among the sample of males, mascu-
linity assumes considerable importance when
the effects of perceived expensiveness are
partialed out or eliminated. The partial cor-
relations between the gift ratings and the
masculinity ratings, when expensiveness is
partialed out, are .58 for the men and -.13
for the women. The inference to be drawn
here is that masculinity does contribute to
preference on the part of the men when the
effects of perceived expensiveness are elimi-
nated. In the case of the women, masculinity
appears to play no role at all in contributing
to preference.


TABLE 1
RANK DIFFERENCE INTERCORRELATIONS AMONG THE EIGHT VARIABLES
(N = 18 designs)

Variable                   1          2          3          4          5          6          7
1. Masculinity-men
2. Masculinity-women       [illeg.]
3. Gift-designers          [illeg.]   [illeg.]
4. Gift-management         [illeg.]   [illeg.]   .55
5. Gift-men                .23        [illeg.]   -.14       -.42
6. Gift-women              [illeg.]   [illeg.]   -.48       -.21       .58
7. Expensiveness-men       -.47       [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.]
8. Expensiveness-women     [illeg.]   -.73       [illeg.]   [illeg.]   [illeg.]   [illeg.]   [illeg.]

Note. With 16 degrees of freedom a correlation of .47 is significant at the .05 level. A correlation of .59 is significant at the .01 level.
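
The partialing-out step described above follows the standard first-order partial correlation formula. The sketch below is illustrative only: the gift-masculinity (.23) and masculinity-expensiveness (-.47) correlations for the men come from the text, but the gift-expensiveness correlation for the men is not stated in the text, so the value of .47 used here is an assumption chosen because it reproduces the reported partial correlation of about .58.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Male consumers: x = gift rating, y = masculinity, z = expensiveness.
r_gift_masculinity = 0.23            # stated in the text
r_masculinity_expensiveness = -0.47  # stated in the text
r_gift_expensiveness = 0.47          # ASSUMED for illustration, not from the text

partial_men = partial_correlation(
    r_gift_masculinity,
    r_gift_expensiveness,
    r_masculinity_expensiveness,
)
print(f"{partial_men:.2f}")
```

The sketch shows why a weak zero-order correlation can rise once expensiveness is held constant: masculinity and expensiveness are negatively related, so expensiveness masks part of the contribution masculinity makes to preference.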


DISCUSSION


The marketing implications of these find-
ings are clear. Had the packaging decision
been made on the basis of the recommenda-
tion of the design firm and on the pooled
judgment of the client’s marketing manage- 
ment, the net effect would have been to 
select designs which would have had the 
least appeal so far as the consumers sam- 
pled were concerned. 

The result of the research was to outline 
specifications for the design group which 
would enable them to modify certain of the 
designs in ways which would cause them 
to be perceived by the consumer both as
masculine and as expensive.

These findings point up the contribution 
which consumer research can make to the 
company involved in new packaging plans. 
Without the kind of information which con- 
sumer research can provide, management de- 
cisions concerning new package development 
remain much more of a gamble than most 
manufacturers can afford. 


REFERENCE 


Stephenson, W. The study of behavior. Chicago:
Univer. Chicago Press, 1953.


(Early publication received February 16, 1961) 




Journal of Applied Psychology 
1961, Vol. 45, No. 4, 225-231 


A COMPARATIVE STUDY OF PROGRAMED AND 
CONVENTIONAL INSTRUCTION IN INDUSTRY 


J. L. HUGHES AND W. J. MCNAMARA


International Business Machines Corporation 


Studies of automated teaching or programed 
instruction (PI) in schools, colleges, and the 
Armed Forces (Lumsdaine & Glaser, 1960) 
have shown that this technique has consider- 
able promise in terms of reducing training 
time and teaching more effectively. At the 
time of writing this article, no comparable 
studies had been reported on the use of PI 
with industrial employees. Because of the im- 
plications of PI for industrial training pro- 
grams, a research project was undertaken to 
evaluate its feasibility and effectiveness in an 
industrial training situation by means of ex- 
periments at technical employee training cen- 
ters of a large company. This article will de- 
scribe the first experiment completed under 
this project, which compared the learning 
achievement of employee classes taught by 
PI in the form of programed textbooks with 
that of classes taught by conventional class-
room instruction. The reactions of the ex-
perimental classes to the use of PI were also
obtained. 


PROCEDURE 


In March 1960, a team composed of a train- 
ing center instructor and a psychologist was 
formed to prepare programed textbooks for 
the introductory section of a 16-week course 
on the IBM 7070 Data Processing System
given to computer service men at a company
training center.

By September, five programed textbooks
containing 719 frames were completed. These
frames covered the first 15 hours of conven-
tional classroom presentation. This amount of
class time would be equivalent to 5 weeks of
a 3-hour college course. The topics covered
were the names and functions of units of the
7070, bit coding, data flow, types of com-
puter words, and the program step. To test
the effectiveness of PI in teaching this type
of material, the following experiment was
designed:


Two classes (N = 42) which reported to the train-
ing center during September 1960 were designated
the control classes. They were taught the introduc- 
tory material of the course by two different instruc- 
tors using the conventional classroom method (lec- 
ture-discussion). This instruction covered a period 
of four mornings and totaled 15 hours, 3 hours on 
the first morning and 4 hours on each of the remain- 
ing three mornings. The afternoons of each day were 
spent on another phase of 7070 training. On the fifth 
morning, these classes were administered a compre- 
hensive 2-hour achievement test consisting of 88 com- 
pletion and multiple-choice items. This test was pre- 
pared by the program writing team with the co- 
operation of several training center instructors. A 
new test was necessary because no satisfactory ob- 
jective test of sufficient length was available for the 
part of the course taught by PI. 

Six classes (N = 70) made up the experimental
group. Two of these classes reported for training 
each month from October through December 1960. 
They were instructed solely by means of programed 
textbooks, which were substituted for the lectures 
and discussions of the introductory part of the course. 

The classroom time allotted for programed texts 
was reduced to 11 hours spread over a 3-day period, 
with 3 hours on the first day and 4 hours on each 
of the last 2 days. This reduction in classroom pres- 
entation time was based on fairly conservative esti- 
mates of the time needed for the trainees to com- 
plete the programed texts. The trainees were also 
permitted to take the programed texts home with 
them for evening study. 

The class instructors were directed to act as if the 
programed textbooks were part of the regular class- 
room procedure in order to minimize any possible 
Hawthorne effect. It was never mentioned to the 
students that they were participating in an experi- 
ment. The instructors confined their role to stating 
at the beginning of the first class period that this 
section of the course would be taught by five self- 
explanatory programed textbooks. They then passed 
out the first programed text. The third and fifth 
texts were passed out at the beginning of the second 
and third days of the experiment, respectively. The 
second and fourth texts were given to the trainees 
during the first and second classroom periods, re- 
spectively, after they had finished the texts passed 
out at the beginning of the period. 

The reason for deliberately pacing the completion
of the five programed texts over the 3-day period in
this manner was to achieve better administrative con-
trol. This experimental design, however, prevented
the faster students from finishing all of the texts be-
fore the third day, and did not permit the direct
measurement of the full saving in presentation time
possible under PI.

After passing out the texts at the beginning of 
each class period, the instructors retired to the back 
of the classroom and confined their activities to re- 
cording the number of frames that each trainee com- 
pleted in class. They were also instructed to answer 
as briefly as possible the questions asked by trainees. 
A record was kept of all questions asked. 

The experimental classes also took the same com- 
prehensive achievement test on the day following the 
completion of their instruction. In addition, they 
anonymously completed a Student Questionnaire ask- 
ing them to evaluate PI. The questionnaire consisted 
of five items with five-point descriptive scales meas- 
uring the effectiveness, difficulty, and acceptability of 
PI, and three open-ended questions asking for any 
general comments and any aspects of PI particularly 
liked or disliked. 

The control and experimental groups were run con- 
secutively rather than concurrently in order to reduce 
any contamination of results. Since members of both 
classes starting each month at the training center 
might come from the same company field office and 
might also room together, it was decided to elimi- 
nate the possibility that study materials would be ex- 
changed by control and experimental trainees during 
evening study periods. 

To avoid interference with the administration of
the company training center, no attempt was made
to assign trainees to class by random procedures. In-
stead, men were assigned to classes as they were re-
ported available for training by their office managers
in the field. In planning the experiment, it was
anticipated that analysis of covariance procedures
would make it possible to control on background
variables which differed for the control and experi-
mental groups and were correlated with achievement
test scores.

In order to test the comparability of the control
and experimental groups on various background data,
such as age, educational level, total months of ex-
perience, and previous computer experience, data were
collected by means of an Education and Experience
Questionnaire. It should be noted that these groups
generally consisted of well-selected
men who had originally [illegible]
employment and who had [illegible]
with the company [illegible]
reasoning ability [illegible]


RESULTS 
The subject matter covered in these experi-
ments took 15 hours of classroom time to pre-
sent by the conventional lecture-discussion
method. The same information was covered
in 11 hours by programed textbooks, a sav-
ing of 4 hours or 27% in classroom presenta-
tion time.

In response to an item on the Student Ques-
tionnaire, 60% of the experimental class re-
ported that PI required less home study than 
the conventional classroom method. Twenty- 
four percent reported spending the same 
amount of home study under both methods, 
and 16% stated that PI required more home 
study. These results indicated that the total 
reduction in study time achieved by the use 
of programed texts was actually more for most 
of the students than indicated in this experi- 
ment, which measured only the reduction in 
classroom presentation time. 

It should also be remembered that the 
amount of classroom presentation time for the 
experimental group was arbitrarily fixed to 
effect a conservative savings in classroom 
time. Because the programed textbooks were taken
out of class by the trainees, records of the actual
time needed for completing the five texts could not
be maintained. From the instructors’ records of the
number of frames that each trainee completed in
class, however, it was possible to derive some
estimate of individual differences in the time
required to complete the program. A mean completion
time per frame was calculated for each trainee. On
the basis of these figures, the mean completion time
per frame for the entire group was calculated to be
49 seconds and the standard deviation, 9 seconds.
For the total 719-frame program, it was therefore
estimated that the mean completion time was 9.8
hours and the standard deviation, 1.8 hours.
Individual differences in estimated completion time
ranged from 7.2 to 15.3 hours. Thus, the mean
completion time was 1.2 hours less than the 11
classroom hours allotted for it in this experiment,
and there were wide individual differences in
completion times. This finding suggested that even
greater savings in instruction time would be
possible for the trainees if they used instruction
on an individual basis. Because of the experimental
design used, these savings could not be directly
measured in the present experiment.
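The arithmetic behind these estimates can be checked with a short sketch. It assumes, as the text implies, that the per-frame mean and standard deviation were simply scaled up by the 719 frames of the full program; the variable and function names are illustrative only and do not come from the original study.

```python
# Figures reported in the text; names are illustrative assumptions.
FRAMES = 719               # frames in the full program (see the Summary)
MEAN_SEC_PER_FRAME = 49    # mean completion time per frame, in seconds
SD_SEC_PER_FRAME = 9       # standard deviation across trainees, in seconds
CLASS_HOURS_PI = 11        # classroom hours allotted to the PI classes
CLASS_HOURS_LECTURE = 15   # classroom hours used by lecture-discussion


def program_hours(seconds_per_frame: float, frames: int = FRAMES) -> float:
    """Scale a per-frame time in seconds up to the whole program, in hours."""
    return seconds_per_frame * frames / 3600


mean_hours = program_hours(MEAN_SEC_PER_FRAME)
sd_hours = program_hours(SD_SEC_PER_FRAME)
margin_hours = CLASS_HOURS_PI - mean_hours
classroom_saving = (CLASS_HOURS_LECTURE - CLASS_HOURS_PI) / CLASS_HOURS_LECTURE

print(f"estimated mean completion time: {mean_hours:.1f} hours")
print(f"estimated standard deviation:   {sd_hours:.1f} hours")
print(f"margin under the 11 hours:      {margin_hours:.1f} hours")
print(f"classroom time saved:           {classroom_saving:.0%}")
```

Scaling the standard deviation by the number of frames treats the 9 seconds as variation between trainees in their average pace, which appears to be how the 1.8-hour figure was obtained.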

A comparison of the aptitude test scores
and background variables for the control and
experimental groups and their correlations


Programed Instruction in Industry 


with the achievement test scores are given in 
Table 1. Of all the background variables, only 
the PAT scores showed a significant difference 
between the two groups and had a significant 
relationship with achievement test scores. The 
hypothesis that both groups were drawn from 
the same population on reasoning ability was 
rejected at the .05 level. The hypothesis of no 
relationship between reasoning ability and 
achievement test scores was rejected at the 
.05 and .01 levels for the control and experi- 
mental groups, respectively. Analysis of co- 
variance was used to test the significance of 
differences in residual achievement test scores 
after eliminating the effect of PAT scores on 
achievement. 

Table 2 shows the results obtained from the 
analysis of covariance (Walker & Lev, 1953). 
The null hypothesis of no difference between
the control and experimental group regression
slopes was accepted (F = .107). The null
hypothesis of no differences in residual
achievement test scores between experimental and
control groups was rejected by F test at the




.01 level of confidence (F = 50.39). Thus, the 
obtained differences in achievement test scores 
could not be wholly attributed to differences 
in aptitude test scores (PAT). 

On the achievement test scores, the control 
group had a mean of 86.2 and a standard de- 
viation of 7.4. The experimental group had a 
mean of 95.1 and a standard deviation of 4.0. 
When the achievement test scores were ad- 
justed for the effect of PAT test scores, the 
control and experimental group means be- 
came 86.9 and 94.7, respectively (Table 2). 
The difference in adjusted means was 7.8, 
only slightly less than the difference of 8.9 in 
the unadjusted means. 
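As a check on the covariance analysis, the following sketch recomputes the residual sums of squares, the common within-groups slope, and the adjusted means from the sums of squares and cross-products printed in Table 2. It is a minimal illustration rather than the authors' own computing procedure, and all names are assumptions introduced here. The printed F ratios (50.39 and .107) appear to have been formed from the rounded table entries (1411/28 and 3/28), so the unrounded ratios computed below differ slightly from them.

```python
# Sums of squares and cross-products as printed in Table 2
# (X = PAT score, Y = achievement test score). Names are illustrative.
control = {"xx": 6062, "xy": 1165, "yy": 2284}        # n = 42
experimental = {"xx": 5008, "xy": 786, "yy": 1115}    # n = 70
between = {"xx": 1280, "xy": 1645, "yy": 2114}
N = 42 + 70


def residual_ss(xx: float, xy: float, yy: float) -> float:
    """Sum of squares of Y remaining after a straight-line fit on X."""
    return yy - xy ** 2 / xx


keys = ("xx", "xy", "yy")
within = {k: control[k] + experimental[k] for k in keys}
total = {k: within[k] + between[k] for k in keys}

# Partition of the residual sums of squares.
ss_within_slopes = residual_ss(**control) + residual_ss(**experimental)  # df = N - 4
ss_common_slope = residual_ss(**within)                                  # df = N - 3
ss_total = residual_ss(**total)                                          # df = N - 2
ss_between_slopes = ss_common_slope - ss_within_slopes                   # df = 1
ss_adjusted_means = ss_total - ss_common_slope                           # df = 1

# F ratios from unrounded quantities.
f_slopes = ss_between_slopes / (ss_within_slopes / (N - 4))
f_means = ss_adjusted_means / (ss_common_slope / (N - 3))

# Common within-groups slope and covariance-adjusted achievement means.
b_w = within["xy"] / within["xx"]
x_means = {"control": 51.2, "experimental": 58.2, "total": 55.6}
y_means = {"control": 86.2, "experimental": 95.1}
adjusted = {
    group: y_means[group] - b_w * (x_means[group] - x_means["total"])
    for group in y_means
}

print(f"common slope b_w = {b_w:.3f}")
print(f"F (slopes) = {f_slopes:.3f}, F (adjusted means) = {f_means:.2f}")
for group, value in adjusted.items():
    print(f"adjusted mean, {group}: {value:.1f}")
```

Because the group means in the table are themselves rounded to one decimal place, the adjusted means obtained this way agree with the printed 86.9 and 94.7 only to within about a tenth of a point.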

The standard deviations of the adjusted 
achievement test scores for the control and 
experimental groups were 7.0 and 3.8, respec- 
tively. An F test of homogeneity of variance 
rejected the hypothesis at the .02 level that 
both samples were from populations with the 
same variance. Thus, the difference in ad- 
justed achievement test means between the 
two groups could have been accounted for by 


TABLE 1 


COMPARISON OF CONTROL AND EXPERIMENTAL GROUPS ON BACKGROUND 
AND APTITUDE TEST VARIABLES 


                                                  r with Achievement
                                                      Test Score
                      Control    Experimental
Variable              (n = 42)     (n = 70)      Control    Experimental
i M 28.8 29.3 
Age —.070 —.025 
SD 5.7 54 
Education (7c ^ 07 = 
attended college) 26 d AU 145 
63.1 
Total months M 62.3 —.077 066 
experience Sp 57.2 38.8** 
Percent with 
papm or Pt. ce — 
d +t 
Programer M ake and 13* 333" 
Aptitude Test t 12.0 &5'* 


a For education and previous computer experience, the product-moment
coefficients were computed with 0 and 1 as scale values for the
independent variable.
* Significant at the .05 level by t test.
** Significant at the .02 level by F test.
*** Significant at the .01 level by t test.




TABLE 2

ANALYSIS OF COVARIANCE OF APTITUDE TEST (X) AND ACHIEVEMENT TEST (Y) SCORES
FOR CONTROL AND EXPERIMENTAL GROUPS

Sums of Squares and Sums of Cross-Products:

             Control    Experimental    Within    Between
             (n = 42)     (n = 70)      Groups    Groups     Total
Σx²            6062         5008        11070      1280      12350
Σxy            1165          786         1951      1645       3596
Σy²            2284         1115         3399      2114       5513

Partition of Sums of Squares of Residuals:

Source of Variation          SS of Residuals     df      MS        F
Between adj. group means          1411            1     1411     50.39*
Within common slope               3055          109       28
Between slopes                       3            1        3      .107
Within slopes                     3052          108       28
Total                             4466          110

Adjustment of Achievement Test Means:

                  Observed Mean        Adjusted Y Mean
Group              X         Y         Yi - bw(Xi - X) a
Control           51.2      86.2            86.9
Experimental      58.2      95.1            94.7
Total             55.6      91.8

a bw = common within-groups slope = 1,951/11,070 = .176.
* Significant at .01 level of confidence.


a difference in variance between the groups
(Edwards, 1950). It was also noted that the
difference in achievement variance was paralleled by
a difference in reasoning ability variance as
measured by the PAT (Table 1).

In order to remove the possible effect of the
initial difference in reasoning ability variance on
achievement variance, control and experimental
groups matched for PAT scores were set up. This
resulted in reducing the number in each group to 34
cases. The achievement test means and standard
deviations for these matched samples were 86.7 and
7.3 for the control group, and 93.9 and 4.6 for the
experimental group. The differences in means and
standard deviations were found to be significant at
the .01 and .02 levels, respectively, by t test for
matched groups. Therefore, the higher mean and lower
variance in achievement for the experimental
TABLE 3

DISTRIBUTIONS OF ADJUSTED ACHIEVEMENT TEST SCORES FOR CONTROL AND
EXPERIMENTAL GROUPS

                          Control Group       Experimental Group
                            (n = 42)               (n = 70)
Adjusted Achievement
Score Level                 N       %            N        %
95 and above                5      12           47       67
90-94                      14      33           15    [illegible]
85-89                       9      22            5    [illegible]
80-84                       7      17            3    [illegible]
75-79                       5      12
70-74                       1       2
65-69                       1       2
Mean                          86.9                  94.7
Standard Deviation             7.0                   3.8




group did not appear to be due either to dif- 
ferences in reasoning ability level or vari- 
ability, but rather to the different teaching 
method used. 

Distributions of the adjusted achievement test
scores for the control and experimental groups are
given in Table 3. The distribution for the
experimental group indicates a concentration of
scores at the upper score levels. If a score of 95
or above is adopted as an indication of mastery of
the subject matter taught, it can be seen that the
experimental group had 67% at this level or above,
compared to only 12% for the control group. The PI
group thus had more than five times as many trainees
at the highest achievement level.

On the Student Questionnaire administered 
anonymously to the six experimental classes, 
the replies of the trainees were very favorable 
to PI (Table 4). Of the total group of 70 


men, 87% liked PI more than conventional 
instruction, and 83% said they would prefer 
using it in future IBM courses. Only 6% 
liked PI less than conventional instruction, 
and 13% would have some objections to using
it in future courses. A possible reason for the 
size of the latter negative response was the 
impression of some students that PI would 
completely replace the use of instructors and 
class discussions in future courses. 

It was interesting to note that practically 
all of the trainees realized the advantages of 
PI over conventional instruction. All of the 
group (100%) stated before taking the ex- 
amination that PI was more effective than 
conventional instruction, and 93% also found 
it less difficult. None of the trainees found 
PI more difficult than the present instruction 
method. 

The Student Questionnaire also provided 
the trainees with several open-ended questions 


TABLE 4

SUMMARY OF STUDENT QUESTIONNAIRE RESPONSES FOR EXPERIMENTAL CLASSES
(n = 70)

                                                    Scale Category a

Compared to regular classroom          PI                           PI         PI
[illegible] company                   Much      PI       PI      Somewhat     Much
courses you have taken:               Less     Less     Same       More       More

1. How do you like the programed
   instruction (PI) method?            3%       3%       7%        24%        63%
2. How difficult was it to learn
   using the programed instruction
   (PI) method?                        62       31        7
3. How much home study does the
   programed instruction (PI)
   method require?                     24       36       24        13          3
4. How well has the programed
   instruction (PI) method taught
   you the material covered?          [illegible]

                                    Strongly    Some     Don't      Some     Strongly
                                     Object  Objections  Care    Preference   Prefer

5. In future company courses you
   may take, would you like to see
   the programed instruction (PI)
   method used in place of the
   regular classroom method?          [illegible]

a For each question, the form had a five-point descriptive scale
containing very unfavorable to very favorable statements about PI.
To save space, these statements have not been completely reproduced here.




asking what they particularly liked or disliked 
about PI. In their responses, 69 of the 70 
trainees mentioned some aspect of PI which 
they liked. A content analysis indicated that 
the most frequently liked aspects of PI were 
its effectiveness as an instruction method (46 
comments); certain characteristics of the 
method itself, such as the repetition of im- 
portant points, the gradual and logical se- 
quence of presentation, the way it maintained 
the student's attention and concentration (23 
comments); and the ability to proceed at 
one's individual rate (10 comments). It ap- 
peared from these comments that, through 
their own experience with PI, the trainees 
themselves recognized a number of the ad- 
vantages usually ascribed to it. 

In response to the question on what they 
particularly disliked, 40 of the 70 trainees 
wrote in a number of comments, but no single 
comment was made by many individuals. For 
example, there was criticism of the need to 
turn pages constantly (8 comments), the 
amount of repetition and written responses 
required (6 comments), the amount of time 
allotted for studying the materials (7 com- 
ments), and the absence of an instructor and 
class discussion (5 comments). Another criti- 
cism made by 7 trainees was the failure of 
the PI textbook used in this experiment to 
provide adequate summaries or outlines of the 
topics covered to aid in reviewing the material. 

Forty-nine trainees responded to another 
question asking for additional comments, but 
most of these remarks merely amplified the 
positive or negative comments reported above.
Of most interest were the 14 comments rec- 
ommending the use of PI in other courses. Of 
these, however, 8 trainees qualified this rec- 
ommendation by stating that PI should not 
be used for extended periods without some 
type of instructor contact or classroom discussion.

DISCUSSION

The results of this experiment using PI in an
industrial training situation corroborated [...]
(Lumsdaine & Glaser, 1960). They indicated [...]ing
time and the improvement in learning achievement
possible through the use of PI in industrial
training.

These findings suggested several applica- 
tions to industrial training programs which 
promise important economies. One is the re- 
duction in the number of days that employees 
need to spend at central company training 
centers learning a given course. This reduction 
can be translated immediately into savings in 
the direct daily living expenses and salaries of 
these trainees and eventually into reductions 
of other educational and administrative costs. 

A second application is the possibility of 
greater decentralization of training by en- 
abling employees to be trained in basic courses 
at local field offices or other locations rather 
than in a central company training center. 
Since the trainee works individually on PI 
materials, an educational package can be prepared
for distribution to these field locations. The
possible economies in this method of instruction
can be easily seen.

In addition to savings in training costs, another
promising result of PI is the possibility of better
trained employees. At present there appears to be no
reason why PI cannot be applied to substantial
portions of technical, manufacturing, clerical,
sales, and management training courses now given to
company employees. Although the effect of better
trained personnel cannot always be measured
directly, it is obviously a major factor in
improving industrial efficiency.

Some important qualifications regarding the use of
PI for industrial training are suggested, however,
by the analysis of trainee responses to the Student
Questionnaire (Table 4). While these responses were
generally very favorable, the write-in comments on
particularly disliked aspects of PI suggested a
number of areas where potential trainee
dissatisfaction could impair the effectiveness of a
training program using PI. These comments concerned
the frequency of page turning, the boredom of too
much repetition and writing of responses, and the
feeling of not having enough time to go through the
programed textbook. Although these comments were
made relatively infrequently in this experiment, the
areas mentioned must be kept in mind when planning
to use PI in industrial training.




Fortunately, much of the page turning can 
be eliminated by improvements already un- 
der way in programed text format, rote repe- 
tition can be minimized by the preparation 
of more stimulating programs, and reasonable 
time limits can be determined by preliminary 
tryouts of programed materials.

Further trainee dissatisfaction with PI could arise
from failure to integrate it properly with other
instructional techniques. It must be remembered that
PI is not a panacea for all training ills. While 87%
of the trainees in this experiment expressed a
liking for PI, there were comments that too much PI
without breaks for class discussion, laboratory, or
other instructor contact at intervals would, in
their opinion, become boring. Anyone concerned with
using PI in industrial training must therefore
carefully plan how to use it in limited amounts to
supplement existing educational procedures rather
than to replace them completely. It is anticipated
that future research in PI will furnish suggestions
on how this may best be done.


SUMMARY 


Programed textbooks containing 719 frames 
were prepared covering the introductory 15 
hours of a 16-week course for trainees in a 
7070 Data Processing System servicing course. 
Achievement test scores for six experimental
classes (n = 70) who used these programed
texts were compared with those of two control
classes (n = 42) taught by the lecture-discussion
method. Significant gains in achievement
and reduction of training time were found for 
the experimental classes. Student reaction to 
programed instruction as measured by a ques- 
tionnaire was found to be favorable. 


REFERENCES 


EDWARDS, A. L. Experimental design in psychological
research. New York: Rinehart, 1950.

LUMSDAINE, A. A., & GLASER, R. Teaching machines
and programmed learning. Washington, D. C.:
National Educational Association, 1960.

WALKER, H. M., & LEV, J. Statistical inference. New
York: Holt, 1953.


(Early publication received May 16, 1961) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 232-236


PERCEIVED TRAIT REQUIREMENTS IN BOTTOM 
AND MIDDLE MANAGEMENT JOBS¹


LYMAN W. PORTER 


University of California, Berkeley 


The importance of job perceptions in the 
management hierarchy was discussed in a 
previous paper (Porter, 1961), and results 
were presented on differences in amount and 
type of psychological need satisfactions per- 
ceived as being received in bottom and middle 
management jobs. The present study is con- 
cerned with a different aspect of job percep- 
tions, namely, personality traits that are 
perceived to be important for success in 
particular management jobs. Knowledge of 
such perceptions should be relevant for 
understanding the factors that affect indi- 
viduals’ motivation and performance on the 
job. Ultimately, top management decisions 
with regard to the promotion and placement 
of individuals in lower management positions 
should benefit from this type of information.

Self-descriptions of individuals in various 
levels of management, obtained by means of 
forced-choice, paired-adjective checklists, have 
been examined previously in a series of studies
(Porter, 1958, 1959; Porter & Ghiselli, 1957). The
results of these self-description studies have shown
certain differences between levels of management
that may in part reflect differences both in
frequency of personality types found in the various
levels, and in role demands of these levels.
The present study was designed to produce
more direct evidence on role demands of 
management jobs as seen by the job incum- 
bents themselves. In the previous self-descrip- 
tion or self-perception studies, the respondent 
was asked merely to describe himself—no 
attention was called to any of his particular 


¹ This study was begun as part of the research
program of the Institute of Industrial Relations at
the University of California, Berkeley. It was
continued under a fellowship granted by the Ford
Foundation.

The assistance of Mildred M. Henry and Robert
Andrews is gratefully acknowledged. The Institute of
Social Sciences, University of California, Berkeley,
contributed to the support of the research assistance.



roles (e.g., male, father, plant superintendent,
young adult, foreman, etc.). In the present
investigation, the individual's specific management
job is the basis on which he makes his judgments
regarding trait requirements. As in previous
studies, the respondent does not know the types of
categories or classes of jobs that are under
investigation, or the comparisons to be made (e.g.,
bottom vs. middle management jobs, staff vs. line
jobs, etc.), and therefore systematic [...]
perceptions by categories or groups of individuals
are unlikely.

The present study specifically investigated 
differences between bottom level managers 
and middle level managers in the perception 
of the relative importance of 13 personality 
traits for success in their respective management
jobs. The study also examined the relative perceived
importance of the traits within each of the two
management levels.


METHOD 
Questionnaire 


The data presented in this study were obtained
from one section of a three-part questionnaire. This
section consisted of 13 personality traits (see
Table 1 for a list of these traits) arranged in
forced-choice pairs. Each trait was paired once with
each other trait so that the 78 pairs constituted a
complete paired-comparison matrix. The respondents
were instructed as follows:

“The purpose (of this part) is to obtain a picture
of the traits you believe would best qualify a
person for your present management position. There
are no right or wrong answers. In each pair of
words, check the one you think is relatively more
important for success in your present management
position. Although specific words will be repeated,
no pairs of words will be duplicated. Make each
choice a separate and independent judgment, and do
not omit any pair.”
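The structure of this section of the questionnaire can be illustrated with a brief sketch. It builds the complete paired-comparison matrix for the 13 traits and scores one set of answers by counting how often each trait is checked. The respondent shown is hypothetical (one who always checks the alphabetically earlier word), and the function and variable names are assumptions introduced for illustration; they do not describe the original scoring procedure in detail.

```python
from itertools import combinations

TRAITS = [
    "Aggressive", "Conforming", "Cooperative", "Dominant", "Energetic",
    "Flexible", "Independent", "Intelligent", "Original", "Persevering",
    "Poised", "Self-Controlled", "Sociable",
]

# Every trait paired once with every other trait.
pairs = list(combinations(TRAITS, 2))


def score(choices: dict) -> dict:
    """Count how many times each trait was checked across all pairs.

    `choices` maps each pair (a tuple of two traits) to the trait chosen.
    """
    counts = {trait: 0 for trait in TRAITS}
    for pair, chosen in choices.items():
        if chosen not in pair:
            raise ValueError(f"{chosen!r} is not a member of {pair!r}")
        counts[chosen] += 1
    return counts


# Hypothetical respondent: always checks the alphabetically earlier word.
example_choices = {pair: min(pair) for pair in pairs}
scores = score(example_choices)

print(len(pairs), "pairs")
print("highest possible score:", max(scores.values()))
print("mean score over traits:", sum(scores.values()) / len(TRAITS))
```

Since each of the 78 choices adds one point to exactly one trait, the scores of any respondent who completes every pair average 78/13 = 6, and no trait can be chosen more than 12 times.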


Sample

Details concerning the sample of respondents have
been presented elsewhere (Porter, 1961). The sample
consisted of 64 bottom management individuals
(first-level supervisors and foremen) and 76 middle




Trait Requirements in Management Jobs 


management personnel (above first-level supervisors 
but below vice-presidents or major department
heads). The sample was obtained from three compa- 
nies, called Companies A, B, and C in this paper 
and in the previous study that utilized the same 
sample. Company A is a large, nationwide manu- 
facturing organization, of which one plant was 
sampled for this study. Company B, also a nation- 
wide concern, produces and distributes a food 
product, with a relatively large number of its jobs
connected with the selling and distributing functions; 
one geographical division of this company provided 
respondents for the study. Company C is a medium- 
sized utility firm, with the respondents from this 
company coming from two of its divisions. Previ- 
ously published details of the sample (Porter, 1961) 
showed that the bottom and middle management 
groups of respondents were similar in median age 
and seniority. The middle management group had a 
somewhat greater proportion of individuals who had 
had some education beyond high school. 


Procedure 

All questionnaires were distributed individually to 
the respondents, through either United States or 
company mail. Each questionnaire was accompanied 
by a company memorandum requesting the indi- 
vidual's cooperation in the university-sponsored re- 
search project. The potential respondent was in- 
formed in this memorandum that his replies would 
be held confidential and that no individual responses 
would be made available to the company. Respond- 
ents returned completed questionnaires in prestamped 


self-addressed envelopes to the investigator. 


RESULTS 


The results of this study are presented in three
tables. Table 1 compares bottom level management
individuals from all three companies combined, with
middle management individuals from all three
companies. Table 2 compares all individuals in each
company, regardless of management level, with those
in each of the other two companies. Table 3 presents
a breakdown of results for the two management levels
within each of the three companies. In all three
tables, ranks and mean scores are presented for each
of the 13 personality traits. A mean score is based
upon the number of times the trait is selected in
its comparisons with the other traits; thus, mean
scores can vary from 0-12, with an over-all mean of
6 for the total group of 13 traits.

² The author wishes to thank the companies that
participated in this study, and their management
personnel who supplied the basic data.


TABLE 1

MEAN SCORES AND RANKS FOR TRAITS, BY MANAGEMENT LEVELS

                      Bottom Management        Middle Management
                          (N = 64)                 (N = 76)
                       Mean                     Mean
Trait                  Score        Rank        Score         Rank
Aggressive          [illegible]       8          6.34           7
Conforming          [illegible]      11          3.21          11
Cooperative         [illegible]       1          9.13           1
Dominant            [illegible]      13          1.42          13
Energetic           [illegible]       3       7.[illegible]     4
Flexible            [illegible]       5          7.08           5
Independent         [illegible]      12          2.53          12
Intelligent         [illegible]       2          9.08           2
Original            [illegible]      10          6.04           9
Persevering         [illegible]       7          6.47           6
Poised              [illegible]       9          5.46          10
Self-Controlled     [illegible]       4          7.68           3
Sociable            [illegible]       6          6.11           8

Table 1 shows that there was a very high
correlation (rho = .97) between the ranks
(and mean scores) of the traits as selected by
bottom management and those as selected by
middle management.
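The rank-order correlation can be reproduced from the two rank columns of Table 1. The sketch below uses the usual rank-difference formula for Spearman's rho, which applies here because neither column contains tied ranks; the function and variable names are illustrative assumptions.

```python
# Ranks from Table 1, in the order the traits are listed
# (Aggressive, Conforming, Cooperative, ..., Sociable).
bottom_ranks = [8, 11, 1, 13, 3, 5, 12, 2, 10, 7, 9, 4, 6]
middle_ranks = [7, 11, 1, 13, 4, 5, 12, 2, 9, 6, 10, 3, 8]


def spearman_rho(ranks_a: list, ranks_b: list) -> float:
    """Spearman rank correlation for two untied rankings of equal length."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


rho = spearman_rho(bottom_ranks, middle_ranks)
print(f"rho = {rho:.2f}")
```

The squared rank differences sum to 10 across the 13 traits, which is what produces a coefficient so close to unity.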

The other important result emerging from Table 1
was the relative position of adjectives indicating
cooperativeness and a willingness to adjust to other
individuals (Conforming, Cooperative, Flexible, and
Sociable), in comparison with those indicating
independence and individuality (Aggressive,
Dominant, Independent, and Original). Within both
management groups the cooperative-type adjectives
were on the average considerably higher ranked in
perceived importance to success on the job than were
the items depicting “rugged individualism.” As can
be seen in Table 1, the trait of Cooperative even
outranked and outscored Intelligent; this was true
for both levels of management, but it was especially
so for individuals in bottom management jobs.

In Table 2, where the data are presented 
by companies rather than by management 
levels, there were again high rank-order cor- 
relations of traits from company to company. 
However, on two items there were major 
shifts in ranks from company to company. 
For the Aggressive trait, a rank of fourth in 




importance was obtained from Company B, a 
company in which a number of the managers 
in the obtained sample had sales supervisory 
duties. In Company A, the manufacturing 
concern, Aggressive had a middle rank of 7, 
and in the utility company it had a relatively 
low rank of 10. The other item that fluctuated 
from company to company was Self-Con- 
trolled, which had a high rank with Com- 
panies A and C, and only a middle rank with 
Company B. The variations in the placement 
of these two items were in accordance with 
expectations based on other knowledge of the 
companies involved. As previously mentioned, 
Company B had a number of sales managerial 
positions in its sample, and therefore it was 
not surprising that individuals in that com- 
pany stressed Aggressive as relatively impor- 
tant and Self-Controlled as only moderately 
essential. Company C, on the other hand, is a 
utility company with a more traditional for- 
mal organization and operating as a noncom- 
petitive economic enterprise. The individuals 
from Company C tended to place Aggressive 
considerably lower than did those in Com- 
pany B, and they also put relatively more 
emphasis on Self-Controlled. Company A man- 
agers, working in a manufacturing organiza- 
tion in a competitive industry, but operating 
under a traditional formal organization setup 


TABLE 2

MEAN SCORES AND RANKS FOR TRAITS, BY COMPANIES

                     Company A         Company B         Company C
                      (N = 40)          (N = 53)          (N = 47)
                    Mean              Mean              Mean
Trait               Score   Rank      Score   Rank      Score        Rank
Aggressive           6.42     7        7.23     4        5.04         10
Conforming           3.78    11        3.55    11        4.43         11
Cooperative          9.95     1        8.64     2        9.87          1
Dominant             1.10    13        1.87    13        1.60         13
Energetic            7.40     4        7.94     3        6.66          5
Flexible             7.08     5        6.75     5        6.79          4
Independent          2.28    12        2.13    12        2.68         12
Intelligent          8.38     3        9.02     1        9.49          2
Original             5.45     9        6.28     8        5.21          9
Persevering          6.20     8        6.49     6        6.47          6
Poised               4.95    10        5.77    10        5.51          8
Self-Controlled      8.42     2        6.34     7     [illegible]      3
Sociable             6.60     6        5.98     9        6.38          7



with rigid lines of authority and responsibil- 
ity, might be expected to have job perceptions 
somewhat in between those of Company B 
and Company C managers on the items 
Aggressive and Self-Controlled. This in fact 
was true for Aggressive, but not for Self- 
Controlled, which was ranked quite similarly 
by Company A and Company C managers. 
(Company A might also have been more 
similar to Company C on Aggressive had the 
N for the first-level managers in the former 
company been as large as that in the latter.) 
Thus, although Companies A and B both 
operate in competitive private industry, Com- 
pany A tends to be more similar to a non- 
competitive organization, Company C. This 
suggests that immediate job functions and the 
type of formal organizational structure are
the more crucial factors in the determination 
of the types of job perceptions studied here. 

The shifts in ranks for the two items of Aggressive
and Self-Controlled in Table 2 not only fit
expectations based on knowledge of company
organizational setups and functions, but they also
demonstrate that the questionnaire instrument used
in this study was sensitive enough to pick up
differences in perceptions between different
populations of respondents.

Again, in Table 2, as in Table 1, the
cooperative-type traits were higher ranked on the
average in all three companies than were the traits
concerned with individuality and independence.

Table 3, which presents a breakdown of the data by
the two management levels within each of the three
companies, shows that the two levels were generally
quite similar to each other within each of the three
different organizations. However, Table 3 does show
an interesting finding that was not clearly apparent
in Tables 1 and 2. If the four "cooperative"
adjectives are compared on mean scores with the four
"independent" adjectives as described earlier in
this section, it can be seen that in Companies A and
C there was a fairly consistent trend for middle
management individuals to have perceived the
“cooperative” traits as relatively less important
than did the bottom management individuals, and to
have perceived the “individualistic” traits as
relatively more important. (In both


TABLE 3

MEAN SCORES AND RANKS FOR TRAITS, BY MANAGEMENT LEVELS WITHIN COMPANIES

                          Company A                      Company B                      Company C
                     Bottom         Middle          Bottom         Middle          Bottom            Middle
                   Management     Management      Management     Management      Management        Management
                    (N = 16)       (N = 24)        (N = 26)       (N = 27)        (N = 22)          (N = 25)
                   Mean           Mean            Mean           Mean            Mean              Mean
Trait              Score  Rank    Score  Rank     Score  Rank    Score  Rank     Score       Rank  Score  Rank
Aggressive          6.00   7       6.71   6        7.23   4       7.22   4        5.05        9.5   5.04  10
Conforming          5.56   9       2.58  12        3.96  11       3.15  11        5.05        9.5   3.88  11
Cooperative        10.69   1       9.46   1        8.81   1       8.48   2       10.27        1     9.52   1
Dominant            0.88  13       1.25  13        2.23  13       1.52  13     [illegible]   13     1.48  13
Energetic           6.81   5       7.19   4        8.00   3       7.89   3        6.68        5     6.64   5
Flexible            6.25   6       7.62   5        6.58   6       6.93   5        6.86        4     6.72   4
Independent         1.69  12       2.67  11        2.35  12       1.93  12        2.27       12     3.04  12
Intelligent         8.69   2.5     8.47   3        8.46   2       9.56   1        9.55        2     9.44   2
Original            4.38  11       6.17   8        6.15   7       6.41   7        4.86       11     5.52   9
Persevering         5.81   8       6.46   7        6.62   5       6.37   8        6.32        7     6.60   6
Poised              5.12  10       4.83  10        5.65  10       5.89  10        5.41        8     5.60   8
Self-Controlled     8.69   2.5     8.25   2        5.92   9       6.74   6     [illegible]    3     8.16   3
Sociable            7.44   4       6.04   9        6.04   8       5.93   9        6.41        6     6.36   7


of these companies, nevertheless, the “coop- 
erative” adjectives received definitely higher 
values than did the “individualistic” adjec- 
tives.) Since the groupings of these two sets 
of adjectives were accomplished post hoc 
rather than prior to the experiment, no mean- 
ingful statistical treatment can be applied to 
this apparent trend, and thus it remains sug- 
gestive rather than proven. It should also be 
noted that in Company B, which emphasizes sales and
distributive functions, the two management levels
were about equal in their relative emphasis on
"cooperative" and "independent" traits.

DISCUSSION 


Two major findings emerge from this study. The
first is the fact that there was little difference
between bottom level and middle level managers in
how they ranked the 13 common personality traits in
terms of perceived importance for success in their
respective jobs. This finding applies to the
relative perceived importance among these 13 traits.
If the respondents had been asked to make absolute
ratings of these traits, rather than to give them
relative ranks via the forced-choice method, it is
possible that significant differences between
management levels might have
been produced. Such differences in absolute 
perceived importance could not be determined 
from the present instrument. 

It is also possible, of course, that some of 
the similarity in rankings between the two 
management groups was due to general social 
or personal desirability differences among the 
traits, with such differences having little to 
do with specific job requirements. However, 
other evidence from the present study indi- 
cated that general social desirability could not 
entirely account for the obtained similarity: 
in Table 2, where results were presented by 
companies rather than by management levels, 
the ranks of some of the traits shifted rather 
widely from company to company in accord- 
ance with the probable psychological demands 
of managerial jobs in those companies. Thus, 
when the respondents were grouped on other 
bases than management levels, the obtained 
ranks seemed to reflect particular organiza- 
tional conditions and were not totally the re- 
sult of some general differences in social de- 
sirability of the items. 

The second major finding from this study 
involves the relatively high ranks obtained for 
the traits showing a concern for adapting to 




the feelings and behavior of others—the co- 
operative-type items—compared with the rel- 
atively low ranks for traits showing a strong 
emphasis on personal and individual capabil- 
ities—the independent-type items. The former 
cooperative-adaptable items consistently were 
ranked higher than the latter individualistic- 
independent items, whether the analysis was 
by management level or by company. Again, 
this finding could be due to differences in the 
general social desirability of the particular 
items; however, in our present-day industrial
society where great emphasis has been placed
on "individual worth," etc., items such as
Independent and Original hardly seem less
personally desirable for individuals than items
such as Cooperative and Flexible. The results
of the present study would seem to suggest 
that the cooperative-type traits definitely are 
perceived as more important for success in 
lower and middle management positions in 
business than are independent-type traits.
There is, of course, no evidence in the present
study to determine whether the perceptions
accurately represent reality. To the extent
that they do, however, they raise a problem
as to the fate of Original, Dominant, Independent
individuals in lower managerial levels
in large organizations. It would seem that
either these individuals would have to conform
in their behavior to their perceptions of
the type of person who gains success in their
positions, or else they would probably have to
forego as rapid advancement up the organizational
ladder as individuals who fit (either
naturally or by effort) the successful stereotype.
In either case, the organization would
probably suffer the loss of some degree of
originality and independence in its future top
echelon executives. This would seem to be an
undesirable occurrence from the organization's
point of view, if the statements of many
presidents and other high-ranking corporation
executives can be taken at face value concerning
the necessity for managers to show initiative,
self-reliance, and creativity in dealing
with company problems. In other words, it
appears that many top-level executives may
be advocating one type of behavior, but rewarding
through a "law of effect" mechanism
quite another type of behavior. If it can be
assumed that the higher the individual is in 
the organization the greater such behavior as 
originality and independence is demanded by 
the job requirements, the question then be- 
comes one of how organizations insure that 
individuals who are best suited in these types 
of traits will be the ones that are likely to 
advance to top management positions. 


SUMMARY 


This study investigated the perception of 
the relative importance of various personality 
traits for success in management jobs. The 
perceptions of 64 individuals in bottom man- 
agement were compared with those of 75 
individuals in middle management jobs. The 
data were obtained by a questionnaire that 
consisted of 13 common personality traits 
arranged in 78 forced-choice pairs, where 
each trait was paired once with every other 
trait. The respondents were asked to check 
the one word in each pair that they thought 
was relatively more important for success in 
their particular management positions. The 
results showed the following: a high correla- 
tion between the trait rankings derived from 
the selections of the lower-level managers and 
those obtained from the middle-level man- 
agers; a high selection of traits indicating 
cooperativeness relative to traits indicating
independence, within both management levels;
and a moderate trend for the cooperative-type 
traits to be perceived as relatively more im- 
portant for bottom management jobs than for 
middle management jobs. 
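
The pairing scheme described above can be checked with a short sketch: pairing each of 13 traits once with every other trait gives 13 x 12 / 2 = 78 pairs. The trait labels below are placeholders, and ranking traits by how often they were chosen is an assumption made for illustration; the summary does not state the exact scoring rule used.

```python
from collections import Counter
from itertools import combinations


def all_pairs(traits):
    """Pair every trait once with every other trait."""
    return list(combinations(traits, 2))


def rank_traits(traits, choices):
    """Rank traits by how often each was chosen (most-chosen first).

    Scoring by simple selection counts is an assumption of this sketch.
    """
    counts = Counter(choices)
    # sorted() is stable, so tied traits keep their original order.
    return sorted(traits, key=lambda t: -counts[t])


# Placeholder labels; the actual 13 trait words are not reproduced here.
traits = [f"trait_{i:02d}" for i in range(1, 14)]
pairs = all_pairs(traits)

# A hypothetical respondent who always checks the first word of each pair.
choices = [first for first, _ in pairs]
ranking = rank_traits(traits, choices)

print(len(pairs))   # number of forced-choice pairs
print(ranking[:3])  # the three most often chosen traits
```

Because every trait appears in exactly 12 pairs, the selection counts of different traits are directly comparable, which is what makes a relative ranking possible from this kind of instrument.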


REFERENCES 
Porter, L. W. Differential self-perceptions of management
personnel and line workers. J. appl.
Psychol., 1958, 42, 105-108.
Porter, L. W. Self-perceptions of first-level supervisors
compared with upper-management personnel
and with operative line workers. J. appl. Psychol.,
1959, 43, 183-186.
Porter, L. W. A study of perceived need satisfactions
in bottom and middle management jobs.
J. appl. Psychol., 1961, 45, 1-10.
Porter, L. W., & Ghiselli, E. E. The self perceptions
of top and middle management personnel.
Personnel Psychol., 1957, 10, 397-406.


(Received August 1, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 4, 237-239 


EFFECT OF SIMULATED APPLICANT STATUS 
ON KUDER FORM D OCCUPATIONAL 
INTEREST SCORES 


C. S. BRIDGMAN AND G. P. HOLLENBECK


Bureau of Industrial Psychology, University of Wisconsin 


Kuder (1950, 1957) has developed an 
interest inventory (Occupational, Form D) 
which provides scales for various occupations. 
He suggests in the manual (1956a, p. 5) that 
these scales and others which could be de- 
veloped for additional occupations might be 
of assistance in selection of industrial per- 
sonnel. 

Each Kuder scale is developed by selecting 
items which differentiate maximally between 
a group already employed in the given oc- 
cupation and a base group representing em- 
ployed people in general (Kuder, 1956b). 
Applicants presumably would be influenced 
by a desire to make a good impression to a 
greater extent than would individuals who 
are already established in the occupation. 
Therefore, use of such an inventory for
selection purposes raises the problem of response
bias. Kuder provides a verification
scale to help identify respondents who either
have answered incorrectly or carelessly, or
who may have answered insincerely. His data
indicate relatively little overlap between
verification scores obtained when subjects responded
sincerely and when they were instructed
to make a good impression. For
example, only 10% of a group of 50 college
students obtained verification scores in the
"Acceptable" range, when instructed to conceal
their faking while giving "best impression"
responses (Kuder, 1956a).

The question still remains whether scores
can be biased in the desired direction on such
occupational scales, and if so, whether the
bias will be revealed by a shift in verification
scores, when respondents answer under conditions
more closely approximating the application
situation, i.e., where the set for faking
is less firmly and explicitly established than
is presumably the case with the "best impression"
set which Kuder used to demonstrate
the effectiveness of the verification scale.


METHOD 


To explore these and related questions, Kuder's 
Form D was administered to four groups of students 
in elementary psychology classes under instructions 
outlined below. Groups were asked to fill out the 
interest inventory as they would if applying for a
specific sales job (sanitary supply salesman), the 
job of industrial psychologist, and an unspecified 
"job in industry." The salesman job was described 
briefly for the first group, and the psychologist job 
was described briefly for the second group. In each 
case the subjects were told to assume that they had 
the necessary background and were interested in 
obtaining the position. However, no explicit instruc- 
tions were given to make a good impression or to 
falsify their responses.¹ For comparison purposes a
fourth group was given vocational guidance instruc- 
tions, i.e., they were asked to complete the inventory 
accurately in order to obtain help in making a 
vocational choice. 

The answer choices of these four groups were 
scored on a sanitary supply salesman scale (developed
at the University of Wisconsin) and on the
Kuder industrial psychologist scale. Verification scores 
were also obtained. 

By comparisons among scores of the experimental 
groups and available reference groups, we have been 
able to consider the following questions: 


1. Can students approximate the responses of individuals
actually employed in the occupations under
consideration, after instructions to respond as though
they were interested applicants?

2. Are the effects of assuming the role of job
applicant specific to the particular occupation, or are
these effects the result of a generalized effort to
"look good"?

3. Do the unbiased scores of student groups differ
from those of Kuder's base group on these occupational
scales?

4. Can Kuder's verification scale differentiate between
biased and unbiased groups when the bias set
is established through instructions to assume the role
of job applicant?


RESULTS AND DISCUSSION


The mean scores for groups instructed to
act as applicants for sales and for psychologist
jobs did not differ significantly from the mean
scores for the corresponding occupational
groups (Table 1).² The students, simply by
assuming an applicant set based on a brief
job description, obtained distributions of
scores comparable to those of individuals
actually employed in the two occupations.


¹ Copies of the detailed instructions can be
obtained from the Bureau of Industrial Psychology,


TABLE 1
MEAN OCCUPATIONAL INTEREST SCORES AND STANDARD
DEVIATIONS FOR GROUPS GIVEN SPECIFIC JOB
INSTRUCTIONS AND FOR THE CORRESPONDING
OCCUPATIONAL GROUPS


Scale          Group                              N    Mean     SD
Salesman       Sales instructions                70    71.5a   10.5
               Actual salesmen                   50    73.1     9.6
Industrial     Psychologist instructions         50    53.2a    7.9
Psychologist   Actual industrial psychologists  200    54.5     9.7


a Mean not significantly different from mean of corresponding
occupational group.
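
The comparisons in Table 1 can be roughly re-checked from the reported summary statistics. The sketch below computes a Welch-type t statistic from means, standard deviations, and group sizes. This is an illustrative assumption, not the authors' procedure (they report analyses of variance), and the input values are as read from the scanned table, where the psychologist-instructions mean is taken as 53.2 and the psychologists' standard deviation as 9.7.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups, from summary statistics only."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Sales-instruction students vs. actual salesmen (salesman scale).
t_sales = welch_t(71.5, 10.5, 70, 73.1, 9.6, 50)

# Psychologist-instruction students vs. actual industrial psychologists.
t_psych = welch_t(53.2, 7.9, 50, 54.5, 9.7, 200)

print(round(t_sales, 2), round(t_psych, 2))
```

Both statistics are small in absolute value, well under the value of about 2 usually needed for significance at the .05 level with samples of this size, which agrees with the statement that the instructed students did not differ significantly from the employed groups.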

Among the experimental groups, the highest
occupational interest scores on each key were
obtained by the group given specific instructions
appropriate to the key (Table 2). Nonspecific
instructions to apply for a job in
industry were significantly less effective than
the specific instructions appropriate to the
key, but did produce mean scores significantly
higher than vocational guidance (sincere)
instructions.

It might be argued that presentation of the 
salesman and psychologist job descriptions 
merely established more firmly a generalized 
set to look good, in comparison to the job-in- 
industry instructions, and that this could 
account for the higher scores. However, evidence
for the specificity of the effects of the
specific job instructions was found when the
groups given these instructions were scored
on the noncorresponding scale (i.e., salesman
instruction group scored on psychologist
key, and psychologist instructions group
scored on salesman key), as shown in the
appropriate cells of Table 2. Salesman instructions
were only as effective as nonspecific
job instructions in increasing scores on the
psychologist scale. On the other hand, psychologist
instructions did not produce an increase
in scores on the salesman scale above
the mean obtained by the vocational guidance
group.


² Analyses of variance indicated significant
(p < .01) differences between the groups.
Significant differences are
indicated in the text and tables.

It may be concluded that the specific job 
instructions produce answer choices appropri- 
ate to the given occupation. The resulting 
scores cannot be attributed simply to a more 
effectively established general set to make a 
good impression.

The vocational guidance group did not
differ significantly from a sample of Kuder's
representative employed group when both
were scored on the sales interest scale.³ (Data
for Kuder's group: N = 97, mean = 60.7,
standard deviation = 9.9. Data for our group
is in Table 2.) However, the industrial psychologist
scores of these two groups differed
markedly. The mean of the students (44.6)
fell approximately halfway between that of
the representative employed group (33.2) and
that of the actual industrial psychologists
(54.5). (Data for Kuder's group: N = 97,
mean = 33.2, standard deviation = 9.8. Data
for the other two groups can be found in
Tables 1 and 2.)

It is not particularly surprising that, in
their unbiased responses to this inventory,
our students are more like industrial psychologists
than is Kuder's representative employed
group. However, this finding raises an
important question as to the appropriate base
group which should be used in developing
these specialized occupational interest scales.


³ The employed group is a sample of 97 whose
answer sheets were obtained from Science Research
Associates.


TABLE 2
MEAN OCCUPATIONAL INTEREST SCORES AND STANDARD
DEVIATIONS FOR EXPERIMENTAL GROUPS


                                                  Scoring Key
                                           Salesman       Psychologist
Type of Set    Instructions           N    Mean     SD    Mean     SD
Specific       Salesman              70    71.5a   10.5   48.7     9.2
Specific       Psychologist          50    61.4     8.2   53.2     7.9
Nonspecific    Job in industry       55    64.8a    6.9   48.9     9.1
Sincere        Vocational guidance  100    61.8     9.1   44.6a    8.?


a Mean significantly different from the means of the three
remaining groups.


TABLE 3
VERIFICATION MEAN SCORES AND STANDARD
DEVIATIONS OF EXPERIMENTAL AND
COMPARISON GROUPS


Group                                      N    Mean     SD
Instruction groups
  Salesman                                70    46.80    5.5
  Psychologist                            50    4 b      4.6
  Job in industry                         55    48.9b    4.9
  Vocational guidance                    100    51.1b    4.5
Comparison group
  Kuder's representative employed         97    54.0     3.8


a Significantly different from vocational guidance mean.
b Significantly different from representative employed mean.


If the scale is to be used to discriminate 
among applicants or potential applicants, 
then samples of applicants can be suggested 
as appropriate for use as a base in developing 
the occupational scale. At least this procedure 
should be considered when there is any reason 
for suspecting that the responses of the appli- 
cants will differ systematically from Kuder’s 
representative employed group. 

The verification mean scores of the bias 
groups were significantly lower than that ob- 
tained by the vocational guidance group, 
except in the case of job in industry set, 
as shown in Table 3. (The scale is designed 
to give lower scores for individuals who are 
trying to make a good impression.) However, 
the observed means for our applicant set 
groups are not nearly as low as those found 
by Kuder for best impression responses, since 
he has reported mean scores in the vicinity 
of 35 for a number of groups responding 
under this set. Another Kuder group, instructed
to conceal its bias, obtained a mean
of 41 on the verification scale (Kuder,
1956a). The overlap of the distributions of
the sample of Kuder's representative employed
group and his group instructed to
conceal its bias was 16%. However, the overlaps
between the vocational guidance instructions
group and the salesman, psychologist,
and "job in industry" instructions groups were,
respectively, 67%, 54%, and 82%. Thus it
must be concluded that effective bias of occupational
scores can be achieved without a
sufficiently large shift in verification scores
to ensure identification of the presence of bias
in individual cases. The possibility remains
that more discriminative modified verification
scales could be developed for use with each
occupational scale.




SUMMARY 


Separate groups of college students were 
instructed to assume they were applying for 
specific jobs (sanitary supply salesman and 
industrial psychologist). When scored on the 
appropriate scale of Kuder’s interest inven- 
tory (Form D), both groups obtained scores 
comparable to the groups employed in these 
occupations. 

College students instructed to apply for an 
unspecified “job in industry” showed signifi- 
cantly higher means on both scales than a 
group given vocational guidance instructions, 
indicating that some part of the bias noted 
above can be introduced by such a general 
nonspecific set. However, evidence was pre- 
sented that the instructions to apply for spe- 
cific jobs produced responses appropriate to 
the specified occupation, rather than simply
inducing a more effective nonspecific set. 

College students given vocational guidance 
instructions obtained scores comparable to 
the base group (representing employed people 
in general) on the sales interest scale, but 
scored significantly above the base group on 
the industrial psychologist scale. This result 
was interpreted as implying the need to use 
a base group comparable to the applicant 
group when it is the purpose of the investi- 
gator to develop Kuder-type interest scales 
to be used for selection purposes. 

Kuder’s verification scale differentiated 
significantly between the groups responding 
with an applicant set and the vocational 
guidance group. However, the differentiation 
was not nearly as effective as reported by 
Kuder between sincere and best impression 
groups. The differentiation was not sufficient 
to warrant use of the verification scale in the 
manner recommended by Kuder. 


REFERENCES 


Kuder, G. F. Identifying the faker. Personnel Psychol.,
1950, 3, 156-167.

Kuder, G. F. Preference Record, Occupational Form
D, manual. Chicago: Science Research Associates,
1956. (a)

Kuder, G. F. Preference Record, Occupational Form
D, research handbook. Chicago: Science Research
Associates, 1956. (b)

Kuder, G. F. A comparative study of some methods
of developing occupational keys. Educ. psychol.
Measmt, 1957, 17, 105-114.


(Received August 1, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 240-243


A SIX-YEAR FOLLOW-UP STUDY OF GRADUATE 
STUDENTS IN PUBLIC HEALTH EDUCATION 


LEONARD D. GOODSTEIN AND BARBARA A. KIRK ¹


University of California Counseling Center, Berkeley 


The past few years have seen the large- 
scale usage, on both the experimental and op- 
erational levels, of many psychometric de- 
vices for selecting students for a variety of 
professional training programs. Most of the 
experimental programs have tended to con- 
centrate upon academic success, as measured 
by grades or other rating procedures, as the 
principal criterion for empirically evaluating 
the usefulness of these selection procedures. 

Since professional success is virtually always
contingent upon the completion of an
academic training program, this concentration
upon academic success is obviously useful and
important. But the relationships between academic
success and later professional success,
beyond the direct effects of the pass-fail dichotomy
in completing training, are mainly
ignored. Since all professions seem to circulate
anecdotal reports of the mediocre student
who makes a significant professional contribution,
as well as of the brilliant student who
sinks into obscurity, it is probable that the
relationships between academic and professional
success are neither simple nor direct.

The difficulties in tracing the careers of
professional persons, especially the difficulties
in evaluating relative success in a profession,
have resulted in a paucity of follow-up studies 
and, consequently, very little is known either 
about the relationships of academic and pro- 
fessional success or about the usefulness of 
tests in predicting later professional success. 
The present paper is an attempt to add to our
meager knowledge about these problems by
studying such relationships in one profession,
that of the public health educator.


¹ This research was completed while the senior author,
on leave from the University of Iowa, was serving
as a Research Consultant to the University of
California Counseling Center. He has now returned
to the University of Iowa, Iowa City.

The authors are indebted to Roger W. Cummings
for his assistance with the statistical analysis of the
data. William Griffiths, Dorothy B. Nyswander, and
Beryl Roberts of the University of California School
of Public Health, Berkeley, were of invaluable assistance
in securing the test protocols and in providing
some of the rankings.


Barthol and Kirk (1956) have reported 
upon the successful development of a psycho- 
metric battery for selecting graduate students 
in public health education, with faculty rankings
of academic progress used as the criterion.
The students with the best background
(combined years of prior work experience 
and/or prior academic training in public 
health), with the highest previous achieve- 
ment in public health (scores above the norm 
group mean on the American Public Health 
Association—APHA—Examination), and with 
the highest levels of mental ability (scores 
above 100 on the Concept Mastery Test— 
Terman, 1956) were rated as the best stu- 
dents in both the classroom and in field work 
placements. Those students whose measured 
interests were in working with people (high 
Strong Vocational Interest Blank—SVIB— 
scores in the Group V, welfare, and Group X, 
verbal-linguistic, occupations) and those students
who were relatively free from personality
disturbances (no score above 70 on the
Minnesota Multiphasic Personality Inventory 
—MMPI—except on the Mf scale) were also 
rated as the better students but these latter 
results were less statistically reliable. The purpose
of the present study is to investigate the
relationships between these test indices of
academic success and later on-the-job success
as a public health educator. 


METHOD 
Subjects 


Of the 20 students in the first class of the Barthol
and Kirk (1956) study, 19 had graduated and were
used as the Ss in the present study. All had received
the MPH degree from the University of California
and had been out of school for 6 years at the
time this follow-up study was initiated. The twentieth
student in the original study had contracted
polio and had not completed the program.




Procedure


The professional work history during the 6 years 
following graduation was obtained for each of the
19 Ss from the records of the School of Public 
Health. In addition, each graduate received a written 
request from a member of the school faculty to again 
take the SVIB and MMPI. Since all 19 Ss complied 
with this request, there are two sets of SVIBs and 
MMPIs available, one obtained at the time of en- 
try into the school and one obtained 6 years after 
graduation. 

Each of the four sets of profiles with the Ss’ names 
removed was separately and independently ranked 
along a continuum of “potential success as a public 
health educator" by two counseling psychologists 
with considerable experience in the use of these in- 
struments. The reliability of these ratings was shown 
by the high interjudge agreement (rho's from .76 to 
.82) between the raters. 

Seven trained counseling psychologists, all of them 
having some familiarity with the field of public 
health education, then independently ranked the 
anonymous 19 work histories along a continuum of 
“over-all success in public health education." The 
subsequently obtained mean rank for each individual 
was then used as the criterion measure of profes- 
sional success. The obtained rank-order correlations 
between each judge and the ranking of the summed 
ranks (minus that particular judge's ranking) ranged 
from .48 to .93 with a median of .84. These results 
obtained between seven independent judgments suggest
that the rankings are reliable enough to be used
as the criterion measure. 
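
The judge-reliability check described above (each judge's ranking correlated with the ranking of the summed ranks of the remaining judges) can be sketched as follows. The helper names and the small three-judge data set are hypothetical and only illustrate the computation; the original rankings are not reproduced in the article. Ties in the summed ranks are given average ranks, which is an assumption of this sketch.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank-order correlation: Pearson correlation of the two sets of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def leave_one_out_rhos(judge_ranks):
    """Correlate each judge with the summed ranks of all the other judges."""
    n_subjects = len(judge_ranks[0])
    rhos = []
    for j, own in enumerate(judge_ranks):
        rest = [
            sum(r[s] for k, r in enumerate(judge_ranks) if k != j)
            for s in range(n_subjects)
        ]
        rhos.append(spearman_rho(own, rest))
    return rhos


# Hypothetical rankings of five work histories by three judges.
demo = [[1, 2, 3, 4, 5], [2, 1, 3, 4, 5], [1, 2, 3, 5, 4]]
demo_rhos = leave_one_out_rhos(demo)
print([round(r, 2) for r in demo_rhos])
```

Leaving a judge's own ranking out of the sum keeps each correlation from being inflated by that judge's contribution to the pooled criterion.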

Two of the three faculty judges who had ranked
academic success in the original study (grades were
not used because of the restriction in range at the
graduate level) also ranked the over-all success of
these 19 Ss with a rank-order correlation of .87 between
their two rankings. It was decided not to use
their current rankings as the criterion measure because
these current rankings may have been contaminated
by the prior (1952) academic rankings.
Nevertheless, the rank-order correlation between the
mean ranking by the faculty judges and the criterion
ranking was .78, indicating considerable common
variance.


RESULTS

The rank-order (rho) correlations between
each of the tests and the 1959 professional
success criterion are presented in Table 1.
The rho's previously reported by Barthol and
Kirk (1956) between the tests and the 1952
academic success criterion are also presented
in Table 1 for the purpose of comparison.


TABLE 1
RANK-ORDER CORRELATIONS OF TEST DATA
AND CRITERIA OF SUCCESS

[Table body not recoverable. Surviving column headings: Predictor Variables, N, 1959 On-the-job Criterion.]

a Not previously reported by Barthol and Kirk because not
then available.
* p < .05.
** p < .01.


Table 1 indicates that those students with
the best background in public health and
whose entry (1952) MMPI profiles were rated
as showing the best potential were ranked as
most professionally successful. Both of
these predictors were equally successful in
predicting the 1952 academic success ranking, 
although these particular MMPI evaluations 
had not been previously examined. The scores 
on the APHA examination, the best predictor 
of academic success, do not significantly cor- 
relate with the professional success criteria. 
The SVIB global rankings, which also had not 
been previously examined but which are nega- 
tively and significantly correlated to the aca- 
demic criterion, are negatively but not sig- 
nificantly related to the professional success 
criterion. It is noteworthy that the criterion 
rankings of academic success by the three 
faculty judges correlated .70 with the criterion 
of professional success, making these aca- 
demic rankings the best predictors of later 
professional success. 

The rankings of the current (1959) MMPI 
and SVIB profiles are not significantly corre- 
lated with the professional success criterion. 
Both sets of mean MMPI profiles are relatively
flat with the mean T scores on the nine
clinical scales ranging from 49 to 61. While
a comparison of the 1959 mean MMPI profile
with its 1952 counterpart indicated virtually
no change except for a small, nonsignificant
drop on Hypochondriasis (Hs), an inspection
of the individual profiles suggested
considerable change. In an effort to assess this 
change, the two mean MMPI profiles of the 
six most successful Ss were compared with 
those of the six least successful Ss (using 


the 1959 criterion). The most successful Ss 
showed an increase over time in the mean 
score on every scale but Hs, with the largest 
increases on the Psychasthenia (Pt), Psychopathic
deviate (Pd), and Depression (D)
scales, while the least successful Ss showed a 
decrease over time in the mean score on every 
scale with the largest decreases on Hs, Schizo- 
phrenia (Sc), Pt, and Mania (Ma). These 
interactional differences are significant (p
< .05), however, only in the case of the Hs,
Pt, and Sc scales. Thus, although there are no
differences between the over-all mean 1952
and over-all mean 1959 MMPI profiles, it
would appear that the most successful Ss
show a rise in their profiles while the least
successful Ss show a decline in their profiles.
This change is most striking in the case of the
Pt scale where the most successful group increases
an average of 8.8 T score units while
the least successful group decreases an average
of 9.14 units.

Both sets of SVIB profiles demonstrate a 
primary pattern in the Group V (welfare) oc- 
cupations with the highest scores on Public 
Administrator, Social Science High School
Teacher, Social Worker, and Minister keys.
A comparison of both the two mean profiles
and the individual profiles does not suggest
that any striking changes have occurred in
the pattern of measured interests. There was
some slight tendency for scores in the Group I 
(scientific) and Group II (technical) occupa- 
tions to decrease and for scores in all the other 
groups to increase in time with the largest in- 
creases in Group V (welfare) especially on 
the Minister key, and in Group VII, Certified 
Public Accountant. 

None of the specific SVIB indices, i.e., pri- 
mary patterns in the Group X (verbal-lin- 
guistic) occupations, Occupational Level (OL) 
scores above 55, or Masculinity-Femininity
(MF) scores for men below 49, which had 
been found to be related to the academic suc- 
cess criterion, were related to the later pro- 
fessional success criterion, either over the en- 
tire range of success or between the two ex- 
treme groups. 


DISCUSSION


The present study indicates once again that
a selection testing program, even one carefully
designed and logically selected, must be evaluated
not only against the empirical criterion
of training success but also against the empirical
criterion of later on-the-job success.
Those graduate students in public health education
who come with the most prior related
academic and work experience (those with 
tested motivation and aptitude in the field) 
and who appear most personally stable are the 
most successful, both as students and as later 
professional public health workers. While the 
measured level of achievement in public health 
(APHA examination) prior to training is a 
predictor of academic success, it is not a pre- 
dictor of professional success. Perhaps the 
formal training program as well as the subse- 
quent on-the-job experiences diminishes the 
importance of this pretraining knowledge.

It is quite interesting to note that, in this 
study, academic success is the best predictor 
of professional success. However, since most 
professional placements, especially those made 
relatively early in professional careers, are 
largely dependent upon faculty recommenda- 
tions, this may not be a surprising finding. 
More data on the relationship of these two 
kinds of evaluation in this and in other pro- 
fessional fields are obviously necessary. 

A comparison of the professional work experience
of those six Ss ranked as the most
successful by the judges with that of the six
ranked as least successful reveals quite different
patterns of job activities. Those judged
most successful showed a gradual increase in
responsibility through the 6 years,
either by assuming a supervisory title or a position
of broader scope, e.g., one involving a
national rather than state program. The top
six Ss had held a total of 16 different professional
positions during the 6-year period while
the bottom six held only eight different positions.
Three of the top six were currently
students working toward a doctorate, one was
a college professor, and one was involved in a
national program, while one of the bottom six
was out of the field, none was involved in additional
academic training, and none was operating
with national programs.

It is believed that the increase in stress accompanying
these positions of greater responsibility
is the crucial factor in the greater maladjustment
evidenced in the retest MMPIs of
the top six Ss. These six individuals were generally
more tense and anxious, as well as more
active, than they were as students. Contrari- 
wise, the better “adjustment” evidenced by 
the bottom six Ss may be seen as a conse- 
quence of their finding a level of occupational 
activity involving less stress and then working 
successfully at that level. 

The results with the SVIB indicate that 
public health education involves a primary 
interest in welfare occupations with a strong 
secondary interest in verbal-linguistic and 
business-contact occupations. Strong scientific 
and technical interests are apparently not 
necessary for success in this field despite the 
a priori assumptions to the contrary. The high 
weight implicitly given to scientific and tech- 
nical interests in the ranking of the SVIB pro- 
files, together with a failure to give sufficient 
weight to the verbal-linguistic and business- 
contact interests, is undoubtedly responsible 
for the negative relationships between the 
SVIB ranking and the success criteria. While 
these interest patterns are clearly understand- 
able in terms of a post facto analysis of the 
career patterns, they had not been anticipated. 


SUMMARY 


The purpose of the present study was to in-
vestigate the relationship of certain test in-
dices of academic success in a graduate cur-
riculum, previously reported by Barthol and
Kirk (1956), and later, on-the-job success 
as a public health educator. Six years after 
graduation, those students who had the most 
background in public health prior to entering 
the program and who were rated as having 
the best adjusted MMPI profiles were rated 
as the most successful by seven counseling 
psychologists who had had some experience 
with the training program. The prior, inde- 
pendent ratings of academic success in the 
program by three faculty judges were, how- 
ever, the best predictors of professional suc- 
cess. Certain changes in both the MMPI and 
SVIB profiles, which were obtained both at 
the time of entry into training and again 6 
years after graduation, were compared to the 
criterion of success and were discussed in 
light of certain patterns in the career de- 
velopment of these 19 public health educators. 


REFERENCES 


Barthol, R. P., & Kirk, Barbara A. The selection
of graduate students in public health education.
J. appl. Psychol., 1956, 40, 159-163.

Terman, L. M. Manual for the Concept Mastery 
Test. New York: Psychological Corporation, 1956. 


(Received August 9, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 244-247


SELF-DISCLOSURE SCORES AND GRADES 
IN NURSING COLLEGE 


SIDNEY M. JOURARD


University of Florida


There are educational programs in which
the interpersonal as well as the intellectual
aspects of students' performances are consid-
ered in the final determination of grades. Col-
legiate programs of nursing are a case in
point; students of nursing are judged partly
on the basis of their ability to learn the ver-
bal subject matter of the curriculum, but also
on the degree to which they have mastered
manual and interpersonal skills. The present
study was undertaken to determine whether
a measure of one aspect of nursing students’
interpersonal behavior (their self-disclosure to
parents and peers), when obtained early in the
students’ careers, would predict their grade-
point averages at the conclusion of their pro-
grams.

METHOD


Measurement of Self-Disclosure. Self-disclosure re-
fers to the act of revealing personal information
about the self. A questionnaire method for assessing
the extent to which subjects have revealed various
categories of information to selected “target-persons”
had previously been found to have both reliability
and at least concurrent validity (cf. Jourard & Lasa-
kow, 1958). Two additional studies with a much ab-
breviated questionnaire (Jourard, 1959; Jourard &
Landsman, 1960) showed that the instrument had
some measure of predictive validity. For the present
investigation, a questionnaire of intermediate length,
listing 25 items of personal information (see Table 1),
was prepared for administration according to the
following instructions:


Indicate on the special answer-sheet the extent to
which certain other people know the information
listed on this questionnaire through your telling it,
or confiding it to them. If you are certain that the
other person knows this information fully, so that
he or she could tell someone else about this aspect
of you, write the number 1 in the appropriate
space. If the other person does not know this in-
formation fully, having only a vague idea, or in-
complete knowledge, write in a zero. Remember, do
not write in a 1 unless you are sure that you have
given this information to the other person in full
enough detail that they could describe you accu-
rately in this respect to another person.


The answer sheet was ruled with 25 rows, and 4
columns headed, respectively, Mother, Father, closest
Male Friend, and closest Female Friend.
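
The scoring implied by this answer sheet can be sketched as follows. This is only an illustration, not the author's procedure: it assumes that each of the 25 x 4 entries is scored 1 (information fully disclosed) or 0, so that each target-person subtotal can range from 0 to 25 and the total from 0 to 100. The function name and the sample sheet are hypothetical.

```python
TARGETS = ("Mother", "Father", "Male Friend", "Female Friend")
N_ITEMS = 25


def score_answer_sheet(sheet):
    """Return (subtotals_by_target, total) for one subject's answer sheet.

    `sheet` is a list of 25 rows; each row holds four entries (0 or 1),
    one per target-person, in the order given by TARGETS.
    """
    if len(sheet) != N_ITEMS or any(len(row) != len(TARGETS) for row in sheet):
        raise ValueError("expected a 25 x 4 answer sheet")
    if any(entry not in (0, 1) for row in sheet for entry in row):
        raise ValueError("entries must be 0 or 1")
    subtotals = {
        target: sum(row[col] for row in sheet)
        for col, target in enumerate(TARGETS)
    }
    return subtotals, sum(subtotals.values())


# Hypothetical subject: discloses the first 15 items to Mother,
# all 25 items to Female Friend, and nothing to the other two targets.
example_sheet = [
    [1, 0, 0, 1] if item < 15 else [0, 0, 0, 1] for item in range(N_ITEMS)
]
subtotals, total = score_answer_sheet(example_sheet)
print(subtotals, total)
```

The four subtotals and the total correspond to the five disclosure scores that are later correlated with grades.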

A previous unpublished study had shown that the
odd-even reliability coefficients for the “target-per-
son” subtotals were in the .80s, and the odd-even
reliability coefficient for the entire questionnaire was
.93.
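
An odd-even reliability coefficient of this kind can be sketched as below. The text does not say whether the reported coefficients were stepped up with the Spearman-Brown formula, so that correction is included only as an option and is an assumption; the data are made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(item_scores, spearman_brown=True):
    """Correlate each subject's odd-item sum with the even-item sum.

    `item_scores` holds one list of per-item scores for each subject.
    With spearman_brown=True the half-test correlation is stepped up
    to estimate full-length reliability (an assumed, optional step).
    """
    odd = [sum(scores[0::2]) for scores in item_scores]   # items 1, 3, 5, ...
    even = [sum(scores[1::2]) for scores in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half) if spearman_brown else r_half


# Hypothetical four-item responses from four subjects.
demo = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0], [1, 0, 1, 1]]
print(odd_even_reliability(demo, spearman_brown=False))
print(odd_even_reliability(demo))
```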

Subjects and Procedure. The self-disclosure ques-
tionnaire was administered to 46 sophomore students
of the University of Florida College of Nursing dur-
ing a regular classroom session. Median age of this
group was 20 years.

By the time this group had become seniors, attri-
tion had reduced the N to 23. Following the comple-
tion of the senior academic year, grade-point aver-
ages of these 23 subjects were calculated for (a) all
nursing courses taken during the 4 years of study,
(b) nursing courses taken in the junior and senior
years, (c) all nonnursing courses taken during the
4-year program, and (d) all courses combined. Prod-
uct-moment correlations were calculated between
these grade-point averages and the self-disclosure
scores obtained 2 years earlier.
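
With N = 23, a product-moment correlation has 21 degrees of freedom, and its significance can be judged with the usual t transformation. The sketch below is illustrative only: it assumes two-tailed tests (the article does not say which were used) and takes the critical t values for df = 21 from standard tables.

```python
from math import sqrt

N = 23  # subjects remaining in the senior year

# Two-tailed critical values of t for df = 21, from standard t tables.
T_CRIT_DF21 = {0.05: 2.080, 0.01: 2.831}


def t_from_r(r, n):
    """t statistic for testing a product-moment r against zero (df = n - 2)."""
    return r * sqrt(n - 2) / sqrt(1.0 - r * r)


def critical_r(t_crit, n):
    """Smallest absolute r that reaches the given critical t with n pairs."""
    return t_crit / sqrt(t_crit ** 2 + (n - 2))


def significance(r):
    """Classify an r obtained from N = 23 subjects (assumed two-tailed)."""
    t = abs(t_from_r(r, N))
    if t >= T_CRIT_DF21[0.01]:
        return "p < .01"
    if t >= T_CRIT_DF21[0.05]:
        return "p < .05"
    return "n.s."


r_05 = critical_r(T_CRIT_DF21[0.05], N)
r_01 = critical_r(T_CRIT_DF21[0.01], N)
print(round(r_05, 3), round(r_01, 3))
print(significance(0.59), significance(0.39))
```

Under these assumptions an r of about .41 is needed for the .05 level and about .53 for the .01 level, so a coefficient such as .59 would be starred while one of .39 would not.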


RESULTS 


Table 2 shows the correlations between the
various grade-point averages and the self-dis-
closure scores. Significant r’s were found be-
tween total disclosure scores (see Column 5)
and the three sets of grade-point averages in
which nursing courses figured. The correlation
between total self-disclosure score and grades
for nonnursing courses did not reach statisti-
cal significance. Table 2 also shows correla-
tions between each of the target subtotals
and the various grade-point averages. It will
be seen that the highest r’s were found be-
tween scores for disclosure to Mother and
the three grade-point averages which in-
cluded nursing courses. The r’s between dis-
closure to Female Friend and nursing grades
were slightly lower, and those between dis-
closure to Father and the two sets of nursing
grades were lower still, but still within the
range of statistical significance. The r’s be-
tween disclosure to Male Friend and grades
were all insignificant, as were those between


Self-Disclosure Scores and College Grades 


TABLE 1

SELF-DISCLOSURE QUESTIONNAIRE

1. What you like to do most in your spare time at home, e.g., read, sports, go out, etc.
2. The kind of party or social gathering that you enjoy most
3. Your usual and favorite spare-time reading material, e.g., novels, nonfiction, science fiction, poetry, etc.
4. The kinds of music that you enjoy listening to most, e.g., popular, classical, folk-music, opera
5. The sports you engage in most, if any, e.g., golf, swimming, tennis, baseball, etc.
6. Whether or not you know and play any card games, e.g., bridge, poker, gin rummy, etc.
7. Whether or not you will drink alcoholic beverages; if so, your favorite drinks: beer, wine, gin, brandy, whisky, etc.
8. The foods you like best, and the ways you like food prepared; e.g., rare steak, etc.
9. Whether or not you belong to any church; if so, which one, and the usual frequency of attending
10. Whether or not you belong to any club or fraternity, civic organizations; if so, the names of these organizations
11. Any skills you have mastered, e.g., arts and crafts, painting, sculpture, woodworking, auto repair, knitting, weaving, etc.
12. Whether or not you have any favorite spectator sports; if so, what these are, e.g., boxing, wrestling, football, basketball, etc.
13. The places that you have traveled to, or lived in during your life: other countries, cities, states
14. What your political sentiments are at present: your views on state and federal government policies of interest to you
15. Whether or not you have been seriously in love during your life before this year; if so, with whom, what the details were, and the outcome
16. The names of the people in your life whose care and happiness you feel in some way directly responsible for
17. The personal deficiencies that you would most like to improve, or that you are struggling to do something about at present, e.g., appearance, lack of knowledge, loneliness, temper, etc.
18. Whether or not you presently owe money; if so, how much, and to whom
19. The kind of future you are aiming toward, working for, planning for, both personally and vocationally, e.g., marriage and family, professional status, etc.
20. Whether or not you are now involved in any projects that you would not want to interrupt at present, either socially, personally, or in your work; what these projects are
21. The details of your sex life up to the present time, including whether or not you have had, or are now having sexual relations, whether or not you masturbate, etc.
22. Your problems and worries about your personality, that is, what you dislike most about yourself, any guilt, inferiority feelings, etc.
23. How you feel about the appearance of your body (your looks, figure, weight), what you dislike and what you accept in your appearance, and how you wish you might change your looks to improve them
24. Your thoughts about your health, including any problems, worries, or concerns that you might have at present
25. An exact idea of your regular income (if a student, of your usual combined allowance and earnings, if any)


TABLE 2

CORRELATIONS BETWEEN SELF-DISCLOSURE SCORES AND GRADE-POINT AVERAGES IN NURSING COLLEGE

                                                            Disclosure Scores
Grade-Point Averages                 Mother        Father        Male Friend   Female Friend   Total Disclosure
All nursing courses                  [illegible]   [illegible]   [illegible]   .59**           [illegible]
Junior and senior nursing courses    [illegible]   [illegible]   [illegible]   .62**           [illegible]
All nonnursing courses               [illegible]   .29           -.05          .39             [illegible]
All courses combined                 [illegible]   .38           .14           .53**           [illegible]



the various target subtotals and grade-point 
average for nonnursing courses. 


Discussion 


The self-disclosure scores may be presumed 
to reflect the degree to which the subjects 
actually have engaged in self-revealing com- 
munication to significant others in their lives. 
The present findings strongly suggest that 
this type of activity prepares a nursing stu- 
dent to engage in the kinds of behavior which 
will earn her the highest grades in nursing 
college. It is interesting that the highest cor- 
relations were found between scores for dis- 
closure to Mother and grades in nursing 
courses. This finding implies that experience 
at communicating openly with one’s mother is 
good preparatory practice for communication 
with other female authority figures, viz., the 
faculty of the college of nursing. Support for 
this interpretation is provided by the fact that 
in the particular college from which the sub- 
jects were drawn, the students were required 
throughout the program to reveal their per- 
sonal reactions to books and articles they 
had read, and patients they had dealt with, 
through the media of classroom discussion and 
written “reaction reports.” It is likely that 
those students who were the most “open” in 
such communication impressed the faculty 
most favorably, and hence earned the higher 
grades. 

Another factor which the faculty considered 
in assigning course grades to students was the 
observed facility with which the students in- 
teracted with patients. An exploratory study 
showed that 17 sophomore and junior stu- 
dents who received high ratings from their 
clinical instructors on “ability to establish 
close, communicating relationships with pa-
tients” had higher total disclosure scores (p
< .05) than 17 matched students who were
rated poor on this ability. The self-disclosure
questionnaires had been administered a year
prior to the time of rating. This finding too
suggests that the higher-disclosing students
were best able to elicit disclosure from pa-
tients, producing thereby a favorable impres-
sion upon their instructors.


That the self-disclosure scores are not in-
dices of intelligence, or of more general aca-
demic aptitude, is attested by an insignificant
correlation of .07 that was found in a sample
of 52 freshmen nursing students between total
score on the ACE Psychological Examination
and total self-disclosure scores. Moreover, the
correlation in the present study between self-
disclosure scores and grades in nonnursing
courses was not significant, suggesting that
the attributes measured by the self-disclosure
questionnaire played a lesser role in perform-
ance in more strictly academic courses.

The question may be raised whether the 
self-disclosure questionnaires employed here 
could have predicted which subjects would 
fail in the program, or would leave it for 
other reasons. A comparison was made be- 
tween the self-disclosure scores of 34 students 
who dropped from the nursing program prior 
to their senior year and those available from 
37 juniors and seniors tested at the same time. 
Mean total disclosure score of the dropout 
group was 59.60, SD 15.79, while that for the 
continuing group was 62.94, SD 15.86. The 
difference between means was not statistically 
significant. Inspection of the scores of those 
students who failed did not show any con- 
sistent trend toward higher or lower mean dis- 
closure scores than those found in students 
who left school for reasons of marriage, or
who changed courses; nor did the self-dis-
closure scores of the failing students differ
significantly from the mean total of the senior
class. This observation is not intended to be
conclusive, however; it is possible that study
of a larger sample of failing students might
point up some trends that were not apparent
here.
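
The comparison of dropout and continuing groups can be re-examined from the reported means and standard deviations. The sketch below assumes a pooled-variance t test for independent groups and treats the reported SDs as sample standard deviations; the article does not state which test was actually used, and the function name is hypothetical.

```python
from math import sqrt


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-groups t from summary statistics (pooled variance).

    Returns (t, df), with t computed as mean_a - mean_b over its standard error.
    """
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df


# Continuing group (N = 37) versus dropout group (N = 34), values from the text.
t, df = pooled_t(62.94, 15.86, 37, 59.60, 15.79, 34)
print(round(t, 2), df)
```

Under these assumptions t is a little under 0.9 with 69 degrees of freedom, well short of the value of roughly 2.0 needed at the .05 level, which agrees with the reported absence of a significant difference.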

Nursing is not the only profession in which
the ability to establish close relationships with
others is a desired trait; counseling, psy-
chotherapy, teaching, military and industrial
leadership all require some measure of inter-
personal competence. The present findings
provide further evidence for the predictive
validity of self-disclosure questionnaires, and
suggest that they may have promise of pro-
viding a measure of one of the important non-
intellective factors which might predict suc-
cess in programs of training for these voca-
tions.

SUMMARY


A self-disclosure questionnaire was admin-
istered to a group of students of nursing dur-
ing their sophomore year. At the conclusion
of their senior years, grade-point averages
were calculated for (a) all nursing courses
taken during the 4 years of study, (b) nurs-
ing courses taken in the junior and senior
years, (c) all nonnursing courses taken dur-
ing the 4-year program, and (d) all courses
combined. Significant correlations were found
between the scores for disclosure to Mother,
Female Friend, and Total Disclosure, on the
one hand, and all grade-point averages in
which nursing courses were included. Disclo-
sure to Father was significantly correlated
with grades for all nursing courses, and grades
in nursing courses taken in the junior and
senior years. Disclosure to Male Friend was
not significantly correlated with any of the
grade-point averages.


REFERENCES 


Jourard, S. M. Self-disclosure and other-cathexis. J.
abnorm. soc. Psychol., 1959, 59, 428-431.

Jourard, S. M., & Landsman, M. J. Cognition, ca-
thexis, and the “dyadic effect” in men's self-disclos-
ing behavior. Merrill-Palmer Quart., 1960, 6, 178-
186.

Jourard, S. M., & Lasakow, P. Some factors in self-
disclosure. J. abnorm. soc. Psychol., 1958, 56, 91-
98.


(Received August 12, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 248-250


EFFECT OF COLORED ILLUMINATION UPON
PERCEIVED TEMPERATURE ¹


PAUL C. BERRY 


Psychological Research Associates, Matrix Corporation 


There is an almost universal tendency to 
speak of green or blue as “cool” colors, and 
of red or orange as "warm" (von Allesch, 
1925). When colored lights are used to illumi- 
nate stage scenes, the “cool” hues give the au- 
dience the impression of a low temperature on 
stage, while the illusion of heat is produced 
by “warm” lighting (Ross, 1938). 

Can a person's judgment of the tempera-
ture of the air around him be biased by the
hue of his surroundings? Can this conven-
tional association between colors and tempera-
ture be used to improve the comfort of indi-
viduals?

Despite anecdotal evidence that this is pos-
sible, the only experimental study with direct
bearing on this question was conducted by
Morgensen and English (1926). They asked
subjects to judge the temperature of heating
coils wrapped in paper of different colors.
Subjects apparently judged the green ones as
hottest.


METHOD 


Because the use of the terms “warm” or “cool” for
certain colors is so widespread, it seemed advisable
to conduct the experiment so that subjects were un-
aware of the experimenter’s interest in their judg-
ments of comfort and temperature. Subjects were
therefore instructed in a simple tracking task, and
were led to believe that the experiment concerned
the effects of colored illumination upon tracking per-
formance. They were told that the special lights gen-
erated considerable heat, and were therefore asked
to indicate by a switch when the temperature rose
to a point at which they began to feel uncomfortably
warm. They were led to believe that the experi-
menter needed this information to avoid having dis-
comfort interfere with their performance on the
tracking task. Shortly after they had turned on the
signal switch, they were taken out of the experi-
mental room for a rest while the room was cooled
by exhausting the air from it. In fact, the light
sources produced negligible heat, and the heating was
caused by electrical blower-heaters concealed in the
room, thus providing uniform heating for all colors.


¹ This work was conducted under contract with
the Ford Motor Company, Purchase Order No. EP
104076-W, and is published by permission.


Five colors of light were used. Two “cool” colors,
green and blue, were balanced with two “warm”
colors, yellow and amber, and white light also was
used. Each subject was given five tests, one with
each color, with the orders of presentation counter-
balanced so that at the completion of testing for 25
subjects, each color had appeared at each position
of the sequence on five occasions.
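
One way to obtain this kind of counterbalancing is to cycle the subjects through the rows of a 5 x 5 Latin square. The sketch below uses a simple cyclic square as an assumption; the article does not say which square or squares were actually used, and a cyclic square balances position only, not the order in which colors follow one another.

```python
COLORS = ["white", "yellow", "amber", "green", "blue"]


def cyclic_latin_square(items):
    """Rows are successive rotations of `items`; each item occurs once per column."""
    k = len(items)
    return [[items[(row + pos) % k] for pos in range(k)] for row in range(k)]


def assign_orders(n_subjects, items):
    """Give subject s the presentation order in row (s mod k) of the square."""
    square = cyclic_latin_square(items)
    return [square[s % len(items)] for s in range(n_subjects)]


orders = assign_orders(25, COLORS)

# Count how often each color falls at each position of the sequence.
counts = {(color, pos): 0 for color in COLORS for pos in range(len(COLORS))}
for order in orders:
    for pos, color in enumerate(order):
        counts[(color, pos)] += 1

print(set(counts.values()))
```

With 25 subjects and five rows, every color lands at every position the same number of times, which is the property described in the text.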

Following these five tests (in which the subject
was unaware of our interest in temperature judg-
ments) the subject was shown samples of the five
colors that had been used and was asked to rank
them in order of the amount of heat they had trans-
mitted.
Experimental Room. The tests were conducted in
a specially constructed room 4' × 10', painted white.
The ceiling of this room sloped from 8' high at one
end, where the subject was seated, to 5' high at the
far end, and was entirely covered by fluorescent
lamps shining through removable filters and fluted
glass diffusers. The tilted ceiling was used in order to
direct the light directly toward the subject.

The experimental room was located within the
PRA building, which is supplied with air condition-
ing maintaining the general ambient temperature at
72°F and 50% relative humidity. A large vented ex-
haust fan in the experimental room permitted re-
moving the hot air generated during the tests, and
replacing it rapidly with the cool air from the sur-
rounding building. Heating of the room was pro-
vided by six 1,000-watt blower-heaters concealed in
the room, controlled to produce a rise of 2°F per
minute in the room temperature.

Measurements of temperature and humidity in the
room were provided by wet- and dry-bulb remote-
reading thermometers.

Irrelevant Task. A task was required that would
appear plausible for testing with the different colored
lights, that would require the subject to look at the
lighting, and that would be sufficiently interesting to
prevent boredom during the testing. For this pur-
pose, an American Automobile Association auto-
trainer, Model 3539, was used.²

In this auto-trainer, a standard set of automobile
controls is used to direct the motion of a model car
on an endless-belt roadway mounted in front of the
controls and sloping upward away from the driver.
This apparatus occupied the full length of the room.
The low point of the sloping ceiling was immediately
above the far end of the roadway, so that in look-
ing down the road, the subject was forced to include
a considerable area of the lighting diffuser in his field
of vision.


² The auto-trainer was kindly supplied by special
arrangement with the manufacturer, Allgaier Shops,
Inc., Arlington, Virginia.


Subjects were told that the experiment, which was
being conducted for the Ford Motor Company, con-
cerned the effect of colored lighting on “some of the
skills related to driving.” They were given some ini-
tial practice with the auto-trainer, and then were
told that during the test runs an automatic scoring
device would record all the occasions in which the
model car ran off the roadway. They were told that
their score would be calculated in errors-per-mile, so
that they should drive slowly and carefully. In fact,
no scoring procedure for the driving task existed.

Lighting Used. The entire area of the ceiling was
covered with 18 daylight white (Champion F40-D
preheat, 40 watts) and 18 blue fluorescent tubes
(Sylvania F40-T12/B/RS preheat, 40 watts). The
white tubes were used alone, or with the yellow,
amber, and green filters, while the blue lamps were
used alone to provide the blue source. Beneath the
tubes were mounted racks for the inclusion of the-
atrical gels to provide the color during the yellow,
amber, and green sessions, and frosted glass filters to
provide an evenly lighted expanse across the entire
ceiling area.

The gelatins were selected on the basis of appro-
priateness of color and approximately equal visual
brightness when in place. Visual brightness was
equated by the use of a Photronic Cell equipped with
a Viscor filter,³ which gives a response to different
hues which closely matches the standard luminosity
curve for eye sensitivity adopted by the Interna-
tional Commission on Illumination. The sources and
filters for the five colors were as follows: white,
unfiltered daylight fluorescent; blue, unfiltered blue
fluorescent, dominant frequency 478 mμ; green, day-
light fluorescent with Roscolene gel 9-43, dominant
frequency 535 mμ; yellow, daylight fluorescent with
Roscolene gel 9-6, dominant frequency 578 mμ; am-
ber, daylight fluorescent with Roscolene gel 9-11,
dominant frequency 588 mμ.

The intensity of the four colored lights was ap-
proximately equal. Readings taken with a Macbeth
Illuminometer on the white surface of the roadway
directly under the light diffusers indicated about 280
apparent foot-candles for the colored lights. The un-
filtered white light taken at the same position pro-
duced a reading of 420 apparent foot-candles.

Duration of Testing. Subjects were told to expect
about 1½ hours of testing. The duration of the test
for a single color ranged from 4 to 29 minutes. Be-
tween the separate colors, the subject was returned
to our outer room (lighted by warm-white fluores-
cent tubes) for 10 minutes.
³ This was kindly supplied by special arrangement
with the Washington, D. C. office of the Weston
Division of Daystrom, Inc.


TABLE 1

ANALYSIS OF VARIANCE ON TEMPERATURE-HUMIDITY
INDEX FOR THRESHOLD OF DISCOMFORT

Source                     SS         df    MS        F        p
Color                      17.36      4     4.34      1.01     .50
Order                      65.90      4     16.48     3.84     .01
Subjects                   2604.34    24    108.51    25.31    <.001
Residual (interactions
  confounded)              394.40     92    [illegible]
Total                      3082.00


Scoring Procedure. For each subject on each color,
the temperature at the moment when the subject in-
dicated the onset of discomfort was read from both
the wet- and the dry-bulb thermometers. These
readings were combined to give the temperature-
humidity index (once called the “Discomfort Index”)
used by the United States Weather Bureau to evalu-
ate the joint effects of temperature and humidity.
This index is calculated as

I = .4(D + W) + 15

where D is the dry-bulb temperature and W is the
wet-bulb temperature.
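
The index is simple enough to state as a one-line function. In the sketch below the coefficient is read as .4, the usual form of the Weather Bureau index with both temperatures in degrees Fahrenheit; the function name and the sample readings are hypothetical.

```python
def temperature_humidity_index(dry_bulb_f, wet_bulb_f):
    """Temperature-humidity index, I = .4(D + W) + 15, temperatures in deg F."""
    return 0.4 * (dry_bulb_f + wet_bulb_f) + 15.0


# Hypothetical readings at the moment a subject signals discomfort.
index = temperature_humidity_index(dry_bulb_f=90.0, wet_bulb_f=75.0)
print(round(index, 1))
```

Readings of this size give an index close to 81, the region in which the reported thresholds of discomfort fall.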

At the close of the tests, subjects were asked to 
rank the five colors according to the amount of heat 
they transmitted, with Rank 1 assigned to the hot- 
test, and Rank 5 to the coolest. 

Subjects. Subjects were 25 paid volunteers, all 
adults, all high school graduates, all able to drive an 
automobile, and none color-blind. They were 19 men 
and 6 women. 

RESULTS 


The mean temperature-humidity index score 
reported at the onset of discomfort was calcu- 
lated both by colors and by order of presenta- 
tion. For the five colors, these scores were: 
white, 80.5; yellow, 81.2; amber, 81.6; green, 
81.0; blue, 81.3. As will be seen in the analy- 
sis of variance (Table 1) the effect of color 
is almost exactly equal to that expected by 
chance alone. 
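
The arithmetic of Table 1 can be checked directly: each mean square is a sum of squares divided by its degrees of freedom, and each F is a mean square divided by the residual mean square. The sketch below takes Subjects SS as 2604.34 and Residual SS as 394.40, the readings under which the sums of squares add to the printed total of 3082.00; the variable names are illustrative.

```python
# (sum of squares, degrees of freedom) for each tested source in Table 1.
SOURCES = {
    "Color": (17.36, 4),
    "Order": (65.90, 4),
    "Subjects": (2604.34, 24),
}
RESIDUAL_SS, RESIDUAL_DF = 394.40, 92

residual_ms = RESIDUAL_SS / RESIDUAL_DF

mean_squares = {name: ss / df for name, (ss, df) in SOURCES.items()}
f_ratios = {name: ms / residual_ms for name, ms in mean_squares.items()}

total_ss = sum(ss for ss, _ in SOURCES.values()) + RESIDUAL_SS
total_df = sum(df for _, df in SOURCES.values()) + RESIDUAL_DF

for name in SOURCES:
    print(name, round(mean_squares[name], 2), round(f_ratios[name], 2))
print("Residual MS", round(residual_ms, 2), "Total", round(total_ss, 2), total_df)
```

The F ratios obtained this way match the tabled values for Color, Order, and Subjects, and the total degrees of freedom come to 124, as expected for 25 subjects tested under five colors.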

The order of presentation did produce a 
small significant effect, later runs showing 
greater heat tolerance than early ones. The 
mean scores by order were: first presentation, 
80.3; second, 80.7; third, 80.8; fourth, 81.7; 
fifth, 82.3. 

TABLE 2

MEAN RANKINGS OF FIVE COLORS ACCORDING
TO “AMOUNT OF HEAT TRANSMITTED”

Color     Rank
White     2.64
Yellow    2.54
Amber     2.50
Green     3.64
Blue      3.68



The results of the rankings by the subjects 
at the close of the experiment are shown in 
Table 2. This effect was tested for significance 
by the Kendall-Friedman ranks test, and is 
clearly significant (W = .186; p < .01). 
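
The reported coefficient of concordance can be converted to a chi-square statistic to see why it is significant. The sketch below relies on the standard relation chi-square = n(k - 1)W with k - 1 degrees of freedom, taking n = 25 subjects and k = 5 colors from the text; the use of the chi-square approximation is an assumption, since the article does not say how the probability was obtained.

```python
from math import exp, factorial


def friedman_chi_square(w, n_subjects, k_conditions):
    """Chi-square equivalent of Kendall's W: n(k - 1)W, with k - 1 df."""
    return n_subjects * (k_conditions - 1) * w


def chi_square_sf_even_df(x, df):
    """Upper-tail probability of chi-square for an even number of df.

    Uses the closed form exp(-x/2) * sum((x/2)**i / i!) for i < df/2.
    """
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


chi2 = friedman_chi_square(w=0.186, n_subjects=25, k_conditions=5)
p_value = chi_square_sf_even_df(chi2, df=4)
print(round(chi2, 1), p_value < 0.01)
```

Under this approximation the statistic is about 18.6 on 4 degrees of freedom, comfortably beyond the .01 level.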

It will be seen that the green and blue are 
ranked almost identically, and so are the yel- 
low and amber, while the white is at an inter- 
mediate position quite close to the yellow and 
amber. 

It might be supposed that in making these
rankings after completion of tests, the subject
was correctly recalling the temperature at
which he had signaled the onset of discom-
fort. However, the within-subjects correlation
between actual temperature-humidity index
score at onset of discomfort and subsequent
ranking is only +.09, not significantly differ-
ent from zero.


CONCLUSION


It may be concluded that:

1. Subjects did not show any change in the
levels of heat they would tolerate as a func-
tion of the colors of illumination, and

2. Subjects nevertheless persisted in the
conventional belief that green and blue are
“cool” colors when asked to rank the colors
that they had experienced.

It appears, therefore, that if colored illumi- 
nation were used to increase the comfort of 
persons exposed to uncomfortably warm con- 
ditions, the threshold of discomfort and hence 
the frequency of their complaints would prob- 
ably not be altered by the coloring (unless, 
of course, the coloring also produced a real 
change in temperature). On the other hand, 
it seems likely that a false belief in the effi- 
cacy of blue or green filters would be wide- 
spread. 


REFERENCES 


Morgensen, M. F., & English, H. B. The apparent
warmth of colors. Amer. J. Psychol., 1926, 37, 421-
428.

Ross, R. T. Studies in the psychology of the theater.
Psychol. Rec., 1938, 2, 127-190.

von Allesch, G. J. Die aesthetische Erscheinungs-
weise der Farbe. Psychol. Forsch., 1925, 6, 1-91;
215-281.


(Received August 15, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 251-256


NEED FOR ACHIEVEMENT AND RISK PREFERENCES
AS THEY RELATE TO ATTITUDES TOWARD REWARD
SYSTEMS AND PERFORMANCE APPRAISAL
IN AN INDUSTRIAL SETTING


HERBERT H. MEYER AND WILLIAM B. WALKER ¹


General Electric Company, New York City


This report covers the second phase of an
exploratory study to investigate the possibility
of improving predictions of certain attitudes
and behaviors of individuals in an industrial
setting by the use of measures designed to
assess achievement motivation. In the first
phase of this study, reported earlier (Meyer,
Walker, & Litwin, 1961), it was found that
managers in jobs with definite entrepreneurial
characteristics scored significantly higher than
specialists of comparable age, education, and
job level, whose jobs were judged to be non-
entrepreneurial in nature, on a thematic ap-
perceptive measure of need for achievement (n
Achievement). The managers also differed sig-
nificantly from the specialists in showing pref-
erence for intermediate level risks (near 50-
50), a behavior characteristic which Atkinson
and others (Atkinson, 1957, 1958; Atkinson
& Litwin, 1960; McClelland, 1958) have dem-
onstrated is indicative of high achievement
motivation or low fear of failure motivation.

The second phase of the study, reported
here, dealt with relationships [illegible]
to two different types of salary plans and the
General Electric Company performance ap-
praisal program. Based on the implications
of past research studies, most of which were
conducted in academic settings, it was hy-
pothesized that persons who score high on n
Achievement, as contrasted to persons scoring
low on this measure, would:

1. Prefer a salary plan based on the merit
pay or “pay for performance” philosophy;
and would have a less favorable attitude to-
ward a “scheduled increases” type of salary
plan where pay is based more on age and
length of company experience

2. Express more favorable attitudes toward
the performance appraisal program which
involves the periodic evaluation of results
achieved on the job and the feedback of this
information to the individual

3. Be more likely to take action to improve
performance on the basis of the performance
appraisal feedback.


¹ The authors wish to express their appreciation to
George H. Litwin of Harvard University, who col-
laborated on the first phase of this study reported
earlier (Meyer, Walker, & Litwin, 1961), [illegible]
the predictor measures and for his advisory assist-
ance on this phase of the study.


Since past research studies have shown that 
the tendency to prefer intermediate type risks 
in a risk-taking situation has been found to be 
correlated with other indexes of high achieve- 
ment motivation and with low anxiety or 
“fear of failure" motivation, it was also ex- 
pected that a measure of risk preferences 
would show significant correlations with the 
reactions listed above. 

Some of the evidence on which the above 
hypotheses are based is summarized by Mc- 
Clelland (1961). He cites research evidence 
to show that the person high in achievement 
motivation prefers a pay-for-performance re- 
ward system because it provides an objective 
means for indicating one's level of compe- 
tence. In other words, the monetary reward 
serves as a symbol of achievement. 

As additional evidence for the hypotheses, 
Litwin (1958) found that when individuals 
were asked to set the amount of reward which 
should be granted for different levels of 
achievement in a game of skill, the gradient 
of the reward curve set by high scorers on 
n Achievement was significantly steeper than 
that set by low scorers on this measure. That 
is, persons high in achievement motivation 
felt that the rate of pay for accomplishing 
more difficult tasks should increase more 
rapidly than did the subjects who scored low. 

Evidence which would lead to the third hy-
pothesis, which deals with the reaction to per-
formance appraisal feedback, is provided in a
study by French (1958). In this study it was
found that task-relevant feedback information
given to subjects working on problems was
significantly more effective in improving per-
formance for high need achievers than it was
for those low in n Achievement. In another
experiment by French (1955) it was also
shown that the degree to which improvement
was shown in performance after feedback of
results information to subjects working on a
coding test was correlated with a measure of
achievement motivation.


METHOD 


The same subjects and predictor variables used in the first
phase of the study reported previously (Meyer et al., 1961)
were used in this second phase of the study. Thirty-one
managers in manufacturing components and 31 specialists in
staff-type jobs, matched with the managers for age, education,
position level, and length of service, completed a short Risk
Preference Questionnaire and wrote brief stories to six
thematic apperception pictures scored for n Achievement,
n Power, and n Affiliation (see Atkinson, 1957, Recommended
Multiple Purpose Set A, Appendix III).


Each subject was also interviewed to obtain an indication of
his attitudes toward and reactions to salary plan variations
and the performance appraisal program. Specifically, the
interviewers rated the degree to which each participant’s
attitudes were favorable or unfavorable toward:

1. A Merit Pay Plan: the type of salary plan presently used by
the General Electric Company for professional or managerial
personnel in which pay level, within a broad range established
for the position, is based on performance, i.e., the manager
appraises performance and establishes rewards in direct
proportion to perceived excellence of accomplishment.

2. A Scheduled Increases Plan: the type of plan typically used
for lower level salaried positions, where increases are
automatic on a scheduled basis. Outstanding performance can
generally be rewarded only by promotion.

3. The Performance Appraisal Program: the periodic feedback of
the boss’s appraisal of performance results to the individual.

The interviewe[...] detail his experi[...]
during the interview. He might have indicated, for 
example, that he had enrolled in a Human Relations 
training course at the suggestion of his supervisor, or 
that he had reorganized his method for keeping scrap 
records so that he could get an earlier indication of 
needed corrections, or that he had made a special 
effort to get reports in on time since this was men- 
tioned by the manager as an item which needed im- 
provement. 

During the interview the subject was also presented with a
list of factors which might be considered in determining a
man’s pay, such as “Age,” “Length of Experience,” “Impact of
individual’s contribution on the success of the component,”
and “Status or level of individuals who must be contacted (in
or out of the company).” He was asked to rank these in order
of the importance that he felt should be given them in
determining a man’s pay. It was predicted that persons high in
achievement motivation would rank Impact of Contribution high,
and Age and Length of Experience low, in importance. It was
also hypothesized that persons scoring high in n Power would
rank Status of Contacts high in importance as a factor
determining pay.

These interview variables, dealing with reactions to 
reward systems and performance appraisal, were cor- 
related with motive measures and other variables in 
the analysis for this phase of the study. 


RESULTS 


Table 1 presents correlations between the different predictor
and control measures and the participants’ attitudes toward
salary plan variables.²

Considering first the motive scores, it can be seen that
n Achievement showed no significant relationships to any of
the attitudinal variables. Need for Power, on the other hand,
was found to be positively related to rankings of the factor
Status of Contacts for the Managers and for the total group,
as hypothesized, and shows some correlations which approach
significance in the direction hypothesized for n Achievement.


" f 
² For this analysis the distributions of ratings of salary
plan alternatives and rankings of factors determining pay were
dichotomized as near as possible to the median in each case.
Therefore, all correlations are biserials except for those
with the Risk Preference Questionnaire, which are
tetrachorics. In computing significance levels of these
coefficients, one-tailed tests were used in those cases where
the direction of the relationship was clearly predicted. For
the Risk Preference Questionnaire a positive correlation
indicates a preference for intermediate level risks. Age and
Length of Experience as factors determining pay were combined
for this analysis, since ranks assigned these factors by the
participants were highly correlated.
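
As an illustration of the kind of analysis this footnote describes, the following Python sketch dichotomizes two sets of scores at their medians and estimates a tetrachoric correlation from the resulting fourfold table. The cosine-pi approximation used here is a stand-in chosen for brevity; the source does not state how the coefficients were actually computed, and the function names and sample scores are hypothetical.

```python
import math
from statistics import median


def median_split(scores):
    """Return 1 for scores above the median and 0 otherwise."""
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]


def tetrachoric_cos_pi(x, y):
    """Cosine-pi approximation to the tetrachoric r for two 0/1 lists."""
    pairs = list(zip(x, y))
    a = pairs.count((1, 1))  # high on both variables
    b = pairs.count((1, 0))  # high on x only
    c = pairs.count((0, 1))  # high on y only
    d = pairs.count((0, 0))  # low on both variables
    if b * c == 0:
        # Degenerate table: at least one discordant cell is empty.
        return 1.0 if a * d > 0 else float("nan")
    return math.cos(math.pi / (1 + math.sqrt(a * d / (b * c))))


# Hypothetical scores for eight participants (not data from the study).
risk_scores = [3, 7, 5, 9, 2, 8, 6, 4]
attitude_ratings = [2, 6, 7, 9, 1, 3, 8, 4]

r_tet = tetrachoric_cos_pi(median_split(risk_scores),
                           median_split(attitude_ratings))
print(round(r_tet, 2))
```

With a balanced fourfold table (equal products of the concordant and discordant cells) the approximation returns zero, and it approaches 1 as the discordant cells empty out.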


TABLE 1


CORRELATIONS BETWEEN MOTIVE MEASURES, AGE, EDUCATION, AND ATTITUDES TOWARD SALARY PLAN VARIABLES

(M = Managers; S = Specialists; T = Total)

                                        Motive Measures
                                        n Achievement        n Power              Risk preference      Age                  Education
Variable                                M      S      T      M      S      T      M      S      T      M      S      T      M      S      T

Interviewers’ ratings of participants’
attitudes toward:
  Merit pay plan                        -14    13     04     36     04     22     5o»    18     28     25     -15    08     -19    =920   -06
  Scheduled increases plan              25     E. 15  05     -32    -20    -09    20     -08    05     08     54**   21     -03    -61**  -30
  Performance appraisal program         09     0t     10     20     03     11     -15    Er d   24     14     -16    01     =A     00     -06
Action taken based on last
performance appraisal                   04     32     21     -08    05     06     48*    68**   63**   -46*   -48*   -48*   -18    -38    -26
Ranked importance of factors
determining pay, ranks high:
  Age and experience                    05     -30    -18    -10    1t     01     -38    30     -18    46**   -20    09     -40**  08     -20
  Impact of contribution                -12    -12    -08    08     45*    26     63**   -42    25     11     -18    01     -05    0      a ”
  Status of contacts                    19     22     18     Si     -01    30*    20     -32    -15    -32    40     02     30     49*    ain


* Significant at 5% level of confidence.
** Significant at 1% level of confidence.




Preference for intermediate risks, as an index of achievement
motivation, shows several correlations in the directions
hypothesized, although the tendencies are somewhat
inconsistent. This measure correlates positively, for example,
with attitudes toward merit pay, as predicted, in the case of
the Managers, but the same correlation is very low for the
Specialists. Attitudes toward the performance appraisal
program, on the other hand, are significantly correlated with
risk preferences, as predicted, for the Specialists, but are
not correlated with risk preferences for the Managers. Risk
preference also correlates in the expected directions with
rankings, by the Manager group, of Age and Experience and
Impact of Contribution as important in determining pay. Low
correlations in the opposite direction were found for the
Specialists.

Consistent results for the two groups are found only for
correlations between (a) risk preferences and the subjects’
reports of whether or not constructive action was taken to
improve on-the-job performance based on their last performance
appraisal discussions, and (b) this latter variable and age.³
Other correlations, for example those between age, education,
and attitudinal variables, are in the expected direction for
one of the groups but are either near zero or in the opposite
direction for the other group.⁴


³ Of the 62 participants in this study, 49 (23 of the Managers
and 26 of the Specialists) had had performance appraisal
discussions with their managers within the last year or two,
which they could describe in some detail. Of the 49, 21 (12 of
the Managers and 9 of the Specialists) reported that they had
taken some specific constructive action to improve
performance, based on suggestions made or topics discussed in
the feedback interview.

⁴ Intercorrelations for the predictor and control variables
(presented in the columns of Table 1) were given in the report
of the first phase of this study (Meyer et al., 1961). Risk
preference, which shows the greatest number of significant
correlations with dependent variables in Table 1, was not
found to be significantly correlated with any of the other
predictor or control variables.


DISCUSSION


The results of this exploratory study were inconsistent. Some
of the correlations between the motivation or risk preference
measures and attitudes toward alternative reward systems and
reactions to the performance appraisal program were
significant in directions predicted. Other expected
correlations were not found. It was encouraging to note,
however, that the statistically significant correlations were
in the directions predicted.

If the occurrence of correlations between 
predictor and criterion variables in directions 
expected is considered as a “backward valida- 
tion” of predictor variables, it would appear 
that the Risk Preference Questionnaire is bet- 
ter than scores on the thematic apperceptive 
measure of n Achievement as an indicator of 
those aspects of achievement motivation on 
which the hypotheses for this study were 
based. All of the hypotheses that were con- 
firmed involved the Risk Preference Ques- 
tionnaire rather than the thematic appercep- 
tion scores. This finding might be explained 
by the possibility that the n Achievement 
score is not a sufficiently reliable measure to 
expect consistent correlations to appear in 
groups as small as those employed in this 
study. It is also possible, of course, that n
Achievement is a variable which has no validity
in this situation.

The fact that the Risk Preference Questionnaire did correlate
with other variables as predicted does not, of course,
indicate with any certainty that it is measuring “achievement
motivation” of the same type as is appraised by thematic
apperception. Other causes could possibly account for the
correlations found. It may be, for example, that the type of
risk preferences shown indicates a degree of negative,
fear-of-failure motivation, as suggested by Meyer, Walker, and
Litwin in a report of the first phase of this study.

The critical behavior measured by this questionnaire may not
be preference for intermediate-level odds in a risk situation,
as would be expected according to the motivation theory on
which the study was based. While the Risk Preference
Questionnaire was scored by including preference for either
long or short odds in one category and preference for
intermediate odds in the other, an inspection of the data
revealed that few persons showed preference for long-odds
alternatives. The significant correlations found appeared to
be generated by the subjects who chose short odds or the
“safer bet” alternatives. In fact, in the correlational
analyses, if the few who expressed preference for long odds
were included with the subjects who preferred intermediate
odds, the significant correlations found were generally
increased rather than decreased. Thus, perhaps preference for
safe bets is the critical variable, indicating a need for
security or fear-of-failure motivation. This may or may not be
indicative of low need for achievement. In the related report
mentioned above, evidence was cited to support the possibility
that fear of failure is more likely to be associated with a
moderate level of achievement motivation than with low need
for achievement.

The high correlation found between risk preference behavior
and the subjects’ reports of whether or not they took
constructive action based on the performance appraisal
discussion provides additional evidence to support the
hypothesis that this measure is assessing some aspect of
achievement motivation. As was mentioned above, however, it is
not necessary to interpret this result as indicating only that
the aspect of motivation measured is a positive,
success-oriented type. Again, the results can be explained
just as well by assuming that the critical behavior tapped by
this measure is the preference for safe bets, which may
indicate a negative, fear-of-failure type of motivation.

The fact that an individual took action based on the appraisal
feedback discussion must have meant that the manager discussed
an area of needed improvement. This might be interpreted by
the subject as a failure on his part in some aspect of job
performance. Atkinson (1957) explains that, according to his
motivation theory model, the effect of failure on the person
with a high level of anxiety about failing is to decrease his
motivation and cause him to avoid the situation. He also cites
research evidence to support this theory. This explanation
could account for the fact that the men who showed preference
for safe bets in this study were also found to be less likely
to take action based on the performance appraisal.

If we ignore fear of failure motivation in explaining the
results, it is possible to make a good case for the
expectation that persons low in achievement motivation would
take constructive action to improve performance based on
appraisal feedback, if the pay-for-performance salary plan
were working well. According to motivation theory and research
evidence, money for itself is not a primary incentive for the
person who is high in n Achievement. For that person, money is
an effective incentive only to the extent that it provides an
objective symbol of success. If other symbols equally positive
and objectively associated with success were available, they
should be just as effective as incentives for the high need
achievers.

For the person low in n Achievement, on 
the other hand, the monetary reward may 
provide the incentive needed to motivate im- 
proved performance on his part. McClelland 
(1961) summarizes research evidence to show 
that money as such is not an effective incen- 
tive for the person high in need for achieve- 
ment, but very effective for the person low in 
this need. In two studies by Atkinson (1958,
Ch. 19-20), for example, it was found that
the addition of monetary rewards for success- 
ful completion of tasks improved the perform- 
ance of persons low in n Achievement to the 
extent that they wiped out significant differ- 
ences in performance in favor of the high need 
achievers which had been found when no such 
rewards were offered. 

On the basis of this evidence, one would 
predict that, if the persons low in n Achieve- 
ment expected monetary rewards to be based 
directly on excellence of performance, they 
would be likely to take action to improve per- 
formance. The fact that these persons were 
found to be significantly less likely to take 
such action could be explained very well by 
the rationale presented above if the additional 
assumption were made that the merit pay 
plan was actually not being administered ac- 
cording to the theory on which such a plan is 
based. The interviews provided some evidence 
to show that this may have been true. 


SUMMARY 


This exploratory study was designed to test the hypotheses
that persons high in achievement motivation would (a) prefer a
salary plan based on a pay-for-performance philosophy; (b)
have a favorable attitude toward the periodic appraisal of
performance and feedback of this appraisal to the individual;
and (c) be likely to take action to improve performance based
on the feedback of appraisal data. The subjects used to test
these hypotheses consisted of 31 Managers and 31 Specialists
in the manufacturing sections of four departments.

If "need for Achievement" scores on a the- 
matic apperception measure are used as the 
measure of achievement motivation, the re- 
sults of the study would have to be consid- 
ered as negative. However, if risk preference 
behavior, which had been found in previous 
research to have predictive validity as a meas- 
ure of achievement motivation, is used as the 
dependent variable, the results could be in- 
terpreted as largely positive. This measure 
was found to be correlated with reactions to 
salary plan variables mentioned above in 
enough of the relationships explored to indi- 
cate that more definitive studies, employing 
larger groups and an improved measure of 
risk-taking behavior, might provide more defi- 


nite confirmation of the hypotheses consid- 
ered. 


REFERENCES 


ATKINSON, J. W. Motivational determinants of risk-taking
behavior. Psychol. Rev., 1957, 64, 359-372.

ATKINSON, J. W. (Ed.) Motives in fantasy, action, and society.
Princeton: Van Nostrand, 1958.

ATKINSON, J. W., & LITWIN, G. H. Achievement motive and test
anxiety conceived as motive to approach success and motive to
avoid failure. J. abnorm. soc. Psychol., 1960, 60, 52-63.

FRENCH, E. G. Some characteristics of achievement motivation.
J. exp. Psychol., 1955, 50, 232-236.

FRENCH, E. G. Effects of the interaction of motivation and
feedback on task performance. In J. W. Atkinson (Ed.), Motives
in fantasy, action, and society. Princeton: Van Nostrand,
1958. Pp. 400-408.

LITWIN, G. H. Motives and expectancy as determinants of
preference for degrees of risk. Unpublished honors thesis,
University of Michigan, 1958.

McCLELLAND, D. C. Risk taking in children with high and low
need for achievement. In J. W. Atkinson (Ed.), Motives in
fantasy, action, and society. Princeton: Van Nostrand, 1958.
Pp. 306-321.

McCLELLAND, D. C. The achieving society. Princeton: Van
Nostrand, 1961, in press.

MEYER, H. H., WALKER, W. B., & LITWIN, G. H. Motive patterns
and risk performance associated with entrepreneurship.
J. abnorm. soc. Psychol., 1961, in press.


(Received August 18, 1960) 




Journal of Applied Psychology 
1961, Vol. 45, No. 4, 257-261 


THE ADMINISTRATIVE JUDGMENT TEST AS RELATED
TO DESCRIPTIONS OF EXECUTIVE
JUDGMENT BEHAVIORS¹


GARLIE A. FOREHAND AND HAROLD GUETZKOW


Center for Programs in Government Administration, University of Chicago 


The Administrative Judgment Test (AJT) 
was developed after World War II by the 
United States Civil Service Commission. The 
test "attempts to measure broad understand- 
ing of the processes of administration . . . 
whether government or private" (Mandell, 
1950, p. 145). Mandell has demonstrated its 
ability to predict administrative success, as 
evidenced by performance ratings and grade 
level (1950, 1956). This study endeavors to 
explore which intellectual facets of the exer- 
cise of judgment within the decision making 
process are related to over-all performance on 
the AJT. 

The AJT is in multiple-choice form; the 55-item test No. 600,
from the commission's Series No. la, was used in this study.
The items


include problems in the relationships between the headquarters
and field offices in an organization, and those between
research and operating personnel. They also include problems
on the timing of programs and the organization of the office
of an administrator. The test does not attempt to measure
technical knowledge in such fields as personnel or budgeting
or accounting (Mandell, 1950).


Forty rating scales were used by superiors and peers to
describe decision making capabilities and styles in the
rendering of administrative judgments. The final or fortieth
scale was global in nature, asking “In general, how
effectively do you feel the executive exercises judgment in
his decision-making?” Cluster analyses of the preceding 39
items were made separately for superior and peer ratings in a
previous analysis (Forehand & Guetzkow, in press).

¹ This work was supported by a grant from the Public Affairs
Division of the Ford Foundation to the Center for Programs in
Government Administration, University College of the
University of Chicago. Authorization to use the Administrative
Judgment Test for research purposes was given by [...],
Director of the Bureau of Programs and Standards of the United
States Civil Service Commission. Guidance and expedition in
its use under [...] conditions were given by Albert Maslow and
Milton [...] in Washington, and by Joseph [...], [...]
Rosenthal, C. S. Littenberg, and [...] in the Chicago regional
area.

By correlating the AJT with the items and cluster-combinations
used by superiors and peers in describing executive judgments,
an attempt is made to obtain further understanding of the
content of this test.


PROCEDURE 


Subjects. One hundred and twenty-seven persons 
holding administrative positions in agencies of the 
United States government served as subjects in the
study. The executives represented 27 agencies; their
civil service grade levels ranged from 9 to 17, with 
a median of 13.2. 

Ratings. The superiors’ ratings were made by each 
subject's immediate organizational superior; the peer 
ratings were made by a co-worker, selected by the 
superior as one who worked closely with the sub- 
ject. The items are described in Table 1; the source 
and rationale of the variables are described elsewhere 
(Forehand & Guetzkow, in press). 

Analysis. Product-moment correlation coefficients 
were used to assess the relationship between total 
scores on the AJT and individual items, and be- 
tween total score and sets of items combined to de- 
fine clusters based upon those obtained earlier from 
the superior and peer ratings. These latter combina- 
tions are called “cluster-combinations” in this paper. 

Because the specific ratings of the performance items shared
in an over-all “halo effect” (Forehand & Guetzkow, in press),
adjusted scores were defined for each specific rating. The
adjusted score consisted of the difference between the rating
and the predicted value of the rating based upon its
regression on the rating of Item 40, “general effectiveness in
exercising judgment.” The correlations between test scores and
the adjusted ratings were determined algebraically, by
computing the part correlations of test scores with a given
rating adjusted for its relationship with the general rating
(DuBois, 1957, pp. 60-61). An adjusted rating was defined for
the superior and peer ratings separately, and for the sum of
the two.
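
The adjustment described above can be sketched in Python as follows. The sketch assumes the standard part (semipartial) correlation formula, in which only the rating, and not the test score, is residualized on the general rating; it also checks that formula against a direct computation from residual scores. The function names and the six sample values are hypothetical, and this is not taken from DuBois (1957).

```python
import math


def pearson(x, y):
    """Product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def part_correlation(r_tx, r_tg, r_xg):
    """Correlation of test t with rating x after x is adjusted for g."""
    return (r_tx - r_tg * r_xg) / math.sqrt(1 - r_xg ** 2)


def adjusted_scores(x, g):
    """Rating minus its predicted value from the regression of x on g."""
    n = len(x)
    mx, mg = sum(x) / n, sum(g) / n
    slope = (sum((a - mx) * (b - mg) for a, b in zip(x, g))
             / sum((b - mg) ** 2 for b in g))
    return [a - (mx + slope * (b - mg)) for a, b in zip(x, g)]


# Hypothetical values for six executives (not data from the study).
test_scores = [1, 3, 2, 5, 4, 6]
specific_rating = [2, 1, 4, 3, 6, 5]
general_rating = [1, 2, 3, 4, 5, 6]

by_formula = part_correlation(pearson(test_scores, specific_rating),
                              pearson(test_scores, general_rating),
                              pearson(specific_rating, general_rating))
by_residuals = pearson(test_scores,
                       adjusted_scores(specific_rating, general_rating))
print(by_formula, by_residuals)
```

The two routes agree, which is why the adjusted correlations can be obtained from the three original correlations alone without computing residual scores for each subject.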

The particular items used in the cluster-combinations are
indicated in Table 2. Both raw scores and adjusted scores on
these combinations were correlated with the over-all score on
the AJT. Scores on the cluster-combinations and the combined
superior and co-worker ratings are the sums of the original
variables in standard score form. The correlations involving
both sets of scores were computed by means of the formula for
the correlation of sums of standard scores.


TABLE 1


CORRELATIONS OF RATINGS BY SUPERIORS AND PEERS WITH TOTAL SCORES ON THE
ADMINISTRATIVE JUDGMENT TEST

(N=127)

                                                  Original Ratings            Adjusted Ratings
Item
No. Description of Item                           Superior Peer     Combined Superior Peer     Combined

 1. Decides effectively when appropriate
    precedent is available                        2i*      35       .21*     02       07       Ol
 2. Documents decisions carefully for review by
    others                                        AT       16       20*      04       41       05
 3. Performs well when bases for decision are
    clear and definite                            .26**    A3       sage     si       .03      05
 4. Performs well even when bases for decision
    are vague and ambiguous                       .23**    45       24       13       05       08
 5. Performs well in making routine decisions     :22*     J8*      — 24**   .06      10       07
 6. Decides well when relevant precedent is
    lacking                                       Qe       .20*     — 9**    Al       2        Al
 7. Assumes responsibility completely when
    decisions are to be postaudited               08       .19*     6        01       Al       -.03
 8. Makes critical or highly important
    decisions adequately                          -20*     - 44     .22*     .03      04       02
 9. Considers all relevant information            21*      .19*     .25**    .03      12       10
10. Makes simple, straightforward decisions
    very satisfactorily                           16       06       14       -.05     -.05     -.09
11. Competently makes decisions even when
    facets of decision must be concealed          32**     02       25°"     20*      -.05     08
12. Formulates decisions capable of being
    given adequate public defense                 29**     16       2o**     15       08       16
13. Skilled in considering goals in his
    judgment making                               14  21*  22*  =z  1:  4  02
14. Works well under heavy decision pressure      17       08       16       OL       -.01     -.03
15. Deals effectively with decisions involving
    staff functions (i.e., budget, personnel)     26**     09.      :21*     12       -.04     03
16. Deals effectively with decisions involving
    general administration matters                .28**    13       26**     BU       04       09
17. Effectively obtains group consensus on
    decisions                                     10       -.01     .06      -.07     =i       5
18. Decides well in situations with some
    policy implications                           4*5  ——  259  —  age  .21*  gt  —  23
19. Competence in “policy implementing” as
    contrasted with “policy making” decisions     04  1  .27**  .24**  —i08  oot  ‘00
20. Embodies technical know-how in judgments      45       Jn       16       01       08       ;
21. Considers implicit, hidden aspects of
    situation                                     15  36**  42  .29**  25**  .03  m
22. Refrains from decisions when appropriate      22*      J0*      — 25     10       .18*     d
23. Avoids over-commitment and retains
    flexibility                                   ' 03  12  08  — 3  09  SE:
24. Screens facts for relevance and accuracy      E  y  d  wom  34“  -19>  .19*  p
25. Keeps details in perspective                  18*      15       21*      03       .08      p
26. Takes initiative when appropriate             12  09  14  -.05  02  7?  E
27. Keeps within scope of given authority         24       5        25**     16       al       7
28. Embodies checks of adequacy in decisions      .19*     42       20*      07       04       08
29. Makes full use of given authority             .08      Zt       .12      .06      .03      -.03
30. Makes judgments which are internally
    consistent                                    32**     16       -30**    .20*     07       15
31. Decisions contain clues for execution         .04      Ap       AS       -.09     12       201)
32. Feeds-back past results in reshaping
    objectives                                    29**     16       Bow      AS:      .08      17
33. Considers wide range of alternatives          27**     6        -28**    16       .09      3
34. Focuses attention on definition of the
    problem                                       gor  16  Fe oa  16  07  .18*
35. Sticks to objective realities, avoiding
    wishfulness                                   29**     03       .21*     .03      -.08     08
36. Works in an orderly, systematic fashion       24**     .08      .20*     .08      -.02     .02
37. Times decisions appropriately                 23**     23e      gs       .23**    45       14
38. Tackles decisions with assurance and
    self-confidence                               AS       44       -19*     14       06       .03
39. Decisions are as appropriate in long-run
    as in short-run                               24**     A3       24e      A3       -06      .09


* r = .17 significant at .05 level.
** r = .23 significant at .01 level.


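
The Procedure refers to a formula for the correlation of sums of standard scores without stating it. One common form, given here as an assumption rather than as the authors' exact computation, is that the correlation of a variable with the sum of k standardized variables equals the sum of its k correlations divided by the square root of k plus twice the sum of the intercorrelations among the k variables. The Python sketch below applies it to the superior and peer correlations for Item 40 reported in this article (.28 and .17); the superior-peer intercorrelation of .40 is hypothetical, since the article does not report it.

```python
import math


def correlation_with_sum(r_with_test, r_between):
    """Correlation of a test with the sum of k standardized ratings.

    r_with_test: the k correlations of the test with each rating.
    r_between:   the k*(k-1)/2 intercorrelations among the ratings.
    """
    k = len(r_with_test)
    if len(r_between) != k * (k - 1) // 2:
        raise ValueError("expected one intercorrelation per pair of ratings")
    # Variance of a sum of k standard scores is k + 2 * (sum of r_ij).
    return sum(r_with_test) / math.sqrt(k + 2 * sum(r_between))


# .28 and .17 are the Item 40 correlations reported in the text;
# .40 is a hypothetical superior-peer agreement, assumed for illustration.
combined = correlation_with_sum([0.28, 0.17], [0.40])
print(round(combined, 3))
```

The combined value depends on how strongly the two raters agree: the lower their intercorrelation, the more the sum gains over either rating alone.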
RESULTS 


The correlations of AJT performance with both the raw and
adjusted ratings are presented in Table 1. Correlations
exceeding .174 are significant at the 5% level; correlations
exceeding .228 are significant at the 1% level.
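
The two cutoffs quoted for N = 127 can be approximated with a few lines of Python. The sketch below uses the Fisher z approximation with a two-tailed normal deviate; this is an assumption made for illustration, since the article does not say how the cutoffs were obtained, and an exact t-based computation differs slightly in the third decimal place. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def critical_r(n, alpha):
    """Approximate two-tailed critical value of r for sample size n."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # normal deviate for alpha
    return math.tanh(z / math.sqrt(n - 3))    # back-transform from Fisher z


for alpha in (0.05, 0.01):
    print(alpha, round(critical_r(127, alpha), 3))
```

Both values fall within about .001 of the cutoffs given in the text, which is consistent with the asterisked entries in Tables 1 and 2 being keyed to r = .17 and r = .23.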

The correlation of the global “over-all performance” rating
(Item 40) with AJT is .28 for superiors and .17 for peers. It
will be noted that 26 or 67% of the 39 original ratings by
superiors have correlations with the AJT which are
significantly greater than zero, while only 11 or 28% of the
raw ratings made by peers are significantly related to the
AJT. Most of these relationships may be accounted for by the
correlation of the specific ratings with the over-all rating
of effectiveness. When the ratings are adjusted, only six or
15% of the ratings by superiors and four or 10% of the ratings
by co-workers are significantly related to AJT performance.

Two adjusted item-ratings from both superiors and peers are
significantly related to AJT performance: performance in
situations with policy making implications (Item 18) and
screening facts for relevance and accuracy (Item 24). In
addition, four superior ratings and two peer ratings are
related to test performance. The difference between superiors
and peers is interpretable as the difference between a
hierarchical and collateral view of decision making. Superiors
see their more adequate subordinates, as appraised by AJT, as
those who operate in terms of the demands of the decision
making hierarchy, making their decisions internally consistent
(Item 30), working successfully with implicit, concealed
facets of the problem situation (Items 11 and 21), and timing
their decisions appropriately vis-à-vis the context of their
situations (Item 37). Peers, on the other hand, view their
more adequate co-workers, as appraised by AJT, as those who
implement rather than make policy (Item 19) and those who
avoid decision when it is prudent to do so (Item 22).

By utilizing the results of the previous cluster analysis of
the ratings (Forehand & Guetzkow, in press), complementary but
somewhat different conclusions emerge. As indicated above,
cluster-combination scores were developed for superiors’ and
co-workers’ ratings with the items listed in Table 2. Some of
the cluster-combinations seem to characterize the personal and
intellectual style of the executive as he makes judgments:
Self-Confidence, Cautiousness, Discernment (describing the
tendency to penetrate into the less obvious aspects of
administrative problems), and Analytic Decision Making
Capability. The remaining cluster-combinations appear to
center upon characteristics of the decision-making situation:
Bureaucratic Decision Making Capability (characterized by
reliance on rules, precedents and policies), Policy Applying
Abilities (utilizing policies as over-all guides to decision),
and Policy Executing Abilities (describing skills in the
direct implementation of policy in specific situations).


TABLE 2

CORRELATIONS OF CLUSTER-COMBINATION RATINGS OF SUPERIORS AND PEERS WITH TOTAL SCORES ON THE
ADMINISTRATIVE JUDGMENT TEST

(N=127)

Descriptions of Cluster-Combinations              Original Ratings            Adjusted Ratings
(with item numbers)                               Superior Peer     Combined Superior Peer     Combined

Self-Confidence
  (Items [...] 38)                                45       44       ag       -.06     .05      -.02
Cautiousness
  (Items [...] 27, 39)                            30       20*      Ae       18*      14       .18*
Discernment
  (Items [...] 24, 28)                            aoe      ae       BT       3399     12       aa
Analytic Decision Making
  (Items [...])                                   34**     .19*     3%       19*      10       16
Bureaucratic Decision Making
  (Items [...] 7, 10, 11)                         25**     45       3y       .07      06       -.04
Policy Applying
  (Items [...] 13, 18)                            Si       gp       ue       44       .19*     19"
Policy Execution
  (Items 14, 15, 16, 19, 20, 25, 30)              26**     .19*     a        09       10       .


* r = .17 significant at .05 level.
** r = .23 significant at .01 level.


Table 2 presents the product-moment cor- 
relations of these cluster-combination scores 
for superiors and peers, both original and ad- 
justed. Once halo is removed, the superiors 
from their hierarchical perspective see their 
high scoring subordinates as cautious, discern- 
ing, and analytic. The peers, observing later- 
ally, see their high scoring co-workers in terms 
of capability in applying agency policy. It is
interesting to note that in neither the original nor the
adjusted ratings, for either superiors or peers, is the
self-confidence displayed in judgment making by the executive
related to performance as evaluated by the AJT.


DISCUSSION 


Mandell (1950) compared the AJT with job performance, as
measured by collective ratings of colleagues and superiors and
by grade level in several samples. He reported correlations
ranging from .50 to .68 with performance, and from .28 to .56
with grade level. He further reported that tests of general
mental ability had validities “substantially lower, in
general, than . . . [those of] the administrative-judgment
test,” despite the fact that the mental abilities test had
correlations in the .60s with the AJT. The following
observations are presented by way of comparing the results of
the present study with those reported by Mandell.

A “job performance” criterion comparable to Mandell’s
“collective ratings of colleagues and superiors” was defined
as the sum of total scores on the 39 specific ratings given by
superiors and peers. Total scores on the AJT have a
correlation of .53 (N = 127, p < .01) with this criterion and
of .32 (N = [...], p < .01) with grade level. Results from
combining superiors’ and peers’ item ratings and
cluster-combination scores are presented in Tables 1 and 2,
respectively. A test of mental ability, the Thurstone Test of
Mental Alertness (Science Research Associates, 1952), has a
correlation with the composite rating of .22 (N = 48, p > .05)
and a correlation of .05 with grade level (N = 75, p > .05).
The correlation between the Test of Mental Alertness and the
AJT is .58 (N = 80, p < .01). These results closely parallel
those reported by Mandell. It should be noted that the sample
employed in this study was heterogeneous with respect to
agency, while those studied by Mandell were relatively
homogeneous.
Mandell (1950) states that the coverage of
the AJT is broad and general. The fact that
so many of the original ratings by superiors
(67%) correlate with the over-all AJT score
substantiates this position. The fact that the
item and cluster-combination scores tend to
correlate so poorly, once adjusted for halo,
again substantiates Mandell's belief that the
AJT is an over-all test without pockets of
specificity. The fact that both the halo rating
and the cluster-combination by the superior
were better correlated with AJT than the
corresponding measures by the peers further
evidences the hierarchical perspective of the
AJT. Its strongest relations were with items
concerned with intellectual processes.

It should be emphasized that this study is
concerned with only a partial criterion of the
AJT's validity: that of cognitive and intellectual
factors in executive judgment. This
study should be viewed, therefore, not as an
attempt to validate again the AJT as an over-all
instrument, but rather as an analysis of
some of the factors which contribute to its general
validity. This approach is consonant with a
strategy of building knowledge about executive
performance by studying segments of the
total problem (Guetzkow & Forehand, in
press). A more adequate understanding of
executive judgment would entail study of additional
segments, such as social characteristics
involved in human relations, motivational
appraisals, and organizational characteristics
broader than those of the immediate decision
situation.
SUMMARY 


Relationships between the Administrative 
Judgment Test of the United States Civil 
Service Commission and ratings by superiors 
and peers of aspects of executive judgment 
were studied in a group of 127 federal ad- 
ministrators. A wide variety of rated charac- 
teristics correlated significantly with the test. 
When the ratings were adjusted to correct for 
the influence of the rater’s general impression 
of effectiveness in making judgments, both 
superiors and peers described executives who 
scored highly on the test as competent in 
making decisions with policy making impli- 
cations and in screening factual information 
for relevance and accuracy. Superiors, in ad- 
dition, described high scoring executives as 
making decisions which are internally consist- 
ent, working successfully with implicit, con- 
cealed facets of the problem situation, and 
timing their decisions appropriately. Peers, on 
the other hand, viewed their more adequate 
co-workers, as appraised by the test, as those 
capable of implementing rather than making 
policy, and those who know when it is prudent 
to avoid decision. 


REFERENCES 


DuBois, P. H. Multivariate correlational analysis.
New York: Harper, 1957.

Forehand, G. A., & Guetzkow, H. Superiors and
co-workers’ descriptions of the judgment and
decision-making activities of government executives.
Mgmt. Sci., in press.

Guetzkow, H., & Forehand, G. A. A research
strategy for partial knowledge useful in the selection
of executives. In R. Tagiuri (Ed.), Research
in executive selection. Boston: Harvard Univer.,
Graduate School of Business, in press.

Mandell, M. The Administrative Judgment Test. J.
appl. Psychol., 1950, 34, 145-147.

Mandell, M. Validity information exchange, No.
9-2. Personnel Psychol., 1956, 9, 105.

Science Research Associates. Examiner manual for
the Thurstone Test of Mental Alertness. (3rd ed.)
Chicago: SRA, 1952.
(Received August 19, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 262-267


COMPARISON OF SEVERAL STYLES OF TYPOGRAPHY
IN ENGLISH ¹


EDMUND B. COLEMAN and INSUP KIM


Johns Hopkins University 


When reading a sentence, we perceive not 
just a single word at each fixation, but groups 
of three or four words. Similarly when under- 
standing a sentence, we do not understand it 
as a linear string of discrete words: we or- 
ganize the words into meaningful phrases, or- 
ganize these phrases into clauses, and so on. 
Stress, pitch, and juncture help organize the 
words into the correct phrases in speech. Re- 
cently, at least three typographies have been 
proposed that help organize the words cor- 
rectly in reading. 


In several books (1943) 

Lillian Lieber 

attempted to aid understanding 
by printing 

only a single phrase 

on each line. 


Andrews (1949) proposed a second style 
called “square span.” 


In square span 


is arranged 
the material 


in double-line blocks. 


North and Jenkins (1951) argued that perhaps 
the main advantage of square span lay in its 
grouping of words into “thought units.” They pro- 
posed a third style called spaced unit in which 


the words are grouped into thought units by
spaces.


In all three of the above styles there are 
cues that help the reader cluster the words 
into the supposedly correct groups. 

The square span has one other advantage 
over spaced unit (indeed this is the source of
its rather puzzling name): it utilizes vertical 
eyespan as well as horizontal. But square span 
does not seem to be the arrangement that 
maximally exploits vertical eyespan. It would 
seem to be maximally exploited by a vertical 
arrangement. 

1 Part of this investigation was carried out during
the tenure of a National Science Foundation predoctoral
fellowship, #30125.

The vertical style below concentrates more
words within the eyespan, even if we assume
the effective span to contract as more relevant
words are concentrated within it.


In the 
vertical 
style 
the 
fixations 
would 
overlap 
and 
maximally 
exploit 
peripheral 
vision. 


So far, previous attempts (Andrews, 1949;
Klare, Nichols, & Shuford, 1957; Nahinsky,
1956; North & Jenkins, 1951) to find a
more efficient style of typography have shown
conflicting results. North and Jenkins (1951),
using long passages, found spaced unit superior
in both reading speed and comprehension.
But Klare et al. (1957), using a similar
procedure, did not find a significant difference
between spaced and conventional.

With oriental languages which traditionally
use vertical arrangements, Chang (1942) and
Sato (1958) found little difference between
the vertical and horizontal arrangements.
Sato and Kusajima (1958) found that spacing
into “meaningful units” increased reading
speed in Japanese.

Regarding these conflicting results, we
should note that there are many possible
arrangements within each experimental style.
Perhaps some of the styles were found
inferior to conventional simply because the
experimenters selected an inefficient arrangement
of the style. In this series of experiments,
we will attempt to select the most efficient
arrangements within each style.

On the other hand, perhaps some of the
styles were found inferior because reading
habits were interfering with the new arrangements.
In this experiment, the styles were
tested under two conditions: the traditional
reading situation and a tachistoscope presentation.
Because of novelty, in the tachistoscope
presentation, reading habits should be a
relatively less important factor than in reading
longer selections.

This study will also consider two styles that
have not been previously investigated: vertical
and Lieber's arrangement of a single
phrase to a line.


PROCEDURE 


This paper reports two fairly independent series of
experiments: in four of them, the subjects read long
passages of about 1500 words; the other four used
a tachistoscope presentation.² Undergraduates from
Johns Hopkins were used as subjects in both series.
Altogether 267 undergraduates were used in eight
independent experiments.

Reading Series. The subject first read the instructions,
which were typed in the style to be investigated,
and thus familiarized himself with the materials
as well as the procedure of the experiment. For
warm-up, he read a practice passage of about 200
words and took a test on it. Then, under a time
limit, he read several experimental passages, each of
about 1500 words typed on standard 8½" × 11" paper;
and immediately after finishing each passage he
took an objective test on it. He was scored on words
read per minute and number of questions answered.

All the reading experiments use a Lindquist (1953,
p. 289) Type V design. To administer this design in
reading experiments for three different styles, one
selects three passages, prepares each passage in all three
styles, and casts these nine preparations into three
different graeco-latin squares in which the different
passages make up the latin factor and order of presentation
is the graeco factor. The subjects are divided
into 3 groups (see Footnote 2), and a subject
reads one passage in each of the experimental styles.
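The counterbalancing just described can be illustrated with a minimal sketch. It builds a single 3 × 3 graeco-latin square in which rows are subject groups, columns are styles, the latin factor is the passage, and the graeco factor is the order of presentation. The style labels, the passage names, and the row and column assignment are assumptions made for illustration; the sketch does not reproduce Lindquist's Type V layout or the particular squares used in these experiments.

```python
STYLES = ["conventional", "style A", "style B"]  # hypothetical labels
PASSAGES = ["P1", "P2", "P3"]                    # latin factor
ORDERS = [1, 2, 3]                               # graeco factor


def graeco_latin_square(n=3):
    """Return an n x n grid of (passage index, order index) pairs.

    Cell (g, s) applies to subject group g reading style s. The two
    superimposed latin squares, (g + s) mod n and (g + 2s) mod n, are
    orthogonal when n is an odd prime such as 3.
    """
    return [[((g + s) % n, (g + 2 * s) % n) for s in range(n)]
            for g in range(n)]


def schedule(square):
    """List each group's (style, passage, order) cells in reading order."""
    plan = []
    for g, row in enumerate(square):
        cells = [(STYLES[s], PASSAGES[p], ORDERS[o])
                 for s, (p, o) in enumerate(row)]
        plan.append((f"group {g + 1}", sorted(cells, key=lambda c: c[2])))
    return plan


if __name__ == "__main__":
    for group, cells in schedule(graeco_latin_square()):
        print(group, cells)
```

In such a square every group meets each style, each passage, and each serial position exactly once, and every passage-order pairing occurs exactly once, so style effects are not confounded with passage difficulty or with order of presentation.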

Tachistoscope Experiment. The material consisted of
72 sentences of three different lengths (24, 16, and 8
words), which were arbitrarily selected from several
nontechnical books. These sentences were equalized
in the number of words and [illegible], and were
typewritten, one sentence on one white card.

The sentences of 24, 16, or 8 words were presented
in a Gerbrands tachistoscope for the following durations:
4", 9.5", or 1". The subject came to the experimental
room and read the instructions (typed
in the style to be investigated), and was thus
familiarized with both the procedures and the materials.
The subject was then given four or five practice
sentences in the tachistoscope. He read the test
sentence, and as soon as the light in the
tachistoscope went out, told the experimenter what
he had read. The subject was scored for the number
of correct words he reproduced, regardless of order.
Synonyms were not counted as correct.

2 The reading experiments were conducted by Coleman,
and the tachistoscope experiments by Kim.


All the tachistoscope experiments used a Lindquist
(1953, p. 273) Type II design. To administer this design
with our material for three different treatments,
one divides his subjects into three groups, divides the
72 sentences into three sets of approximately equal
difficulty, and types each of the 72 sentences in all
three treatments. A subject reads one set of sentences
in each experimental style.


RESULTS 


Reading Experiment on Spaced Style 


Sixty-four subjects were used in this experi- 
ment which compared conventional to three 
variations of spaced style. Four passages of 
1500 words each were selected from Wood- 
worth (1938), each passage was typed in all 
four styles, and a 25-item test was prepared 
and mimeographed for each passage. 

All the variations in this experiment use
only two spaces between units. Furthermore,
in all variations, two spaces were used after
commas and three spaces were used after periods.
The three variations of the spaced style
were: (a) Clauses were separated by an extra
space. In addition to spacing between clauses,
phrases that modified a clause as a whole were
separated by an extra space. The units in this
variation averaged 7.25 words. (b) Grammatical
units (the subject plus its modifiers, verb
plus its modifiers, and object plus its modifiers)
were separated from one another whenever
any two such adjacent units totaled more
than five words. This was in addition to the
separations above. The units in this variation
averaged 4.72 words. (c) Phrases (prepositional,
infinitive, and participial) were separated.
The units in this variation averaged
3.35 words.


TABLE 1
MEAN SCORES ON PASSAGES FOR
READING EXPERIMENTS

                                    Words Read    Questions Answered
Style                               per Minute    per Subject

Vertical Style
  Conventional                        260           12.8
  8 letters, spaced into phrases      237           12.6
  8 letters, unspaced                 230           11.9
  1 word, spaced                      224           11.4
One Phrase to a Line
  Conventional                        260.2         13.6
  Long phrases to a line              251.1         13.6
  Short phrases to a line             242.5         14.7
Spaced Style a
  Conventional                        260           13.2
  Space between clauses               261.2         13.3
  Space between grammatical units     261.6         13.8
  Space between phrases               261.2         14.0

a In this experiment, subjects read more difficult selections.
Scores were multiplied by constants (260/204 and 13.2/10.3)
to make all styles roughly comparable.

The results reported in Table 1 are disap- 
pointing. The overall difference between styles 
is not significant using analysis of variance 
with a pooled error term. Even the t ratio between
conventional and the best version of
spaced style is not significant. 


Reading Experiment on Vertical Style 


The first experiment used 32 subjects and 
conventional was compared to three varia- 
tions of the vertical style. Four passages of 
1680 words each were selected from de Kruif 
(1932), each selection was typed in all four 
styles, and a 20-question multiple choice test 
was prepared for each selection. Two words 
were grouped on the same line only if they 
totaled less than eight spaces. The three varia- 
tions of the vertical square span were: (a) 
Two words grouped together on the same line 
only if they totaled less than eight spaces, 
and the sentences were not spaced into thought 
units. (b) Two words grouped together on the
same line only if they totaled less than eight 
spaces, and the sentences were spaced into 
thought units. These thought units averaged 
4.8 words to a unit. (c) Only one word to a 
line and spaced into the same thought units 
as in b.

Table 1 shows that the results are again disappointing.
If we use reading speed as the criterion,
a sign test is adequate to show that
conventional is read significantly faster than
all three vertical styles. Ignoring ties, the
number of subjects who read conventional 
faster is 25 to 5, 24 to 6, and 26 to 1. These 
ratios are all significant beyond the .01 level. 
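The sign test cited above can be checked directly from the reported splits. The following is a minimal sketch of an exact two-sided sign test, assuming ties have already been dropped and that each subject is equally likely to read either style faster under the null hypothesis. The function name is hypothetical, and the authors do not say whether they used exact binomial probabilities or a table or approximation.

```python
from math import comb


def sign_test_p(wins, losses):
    """Exact two-sided sign-test p value (ties excluded beforehand)."""
    n = wins + losses
    k = max(wins, losses)
    # Chance of a split at least this lopsided when each outcome has p = .5.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Splits reported in the text: conventional faster vs. vertical faster.
    for wins, losses in [(25, 5), (24, 6), (26, 1)]:
        print(wins, losses, sign_test_p(wins, losses))
```

Under these assumptions each of the three splits gives a probability well below .01, in line with the statement in the text.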

If we use comprehension as the measure,
all the differences still favor the conventional
style, but none are significant. In other words,
even though untrained subjects do read vertical
style slower, they must read it more accurately
(accuracy = number of questions answered
per word read). With difficult material,
some may argue that accuracy is more
important than speed. Therefore a second ex- 
periment was designed in which subjects were 
given ample time to finish each passage. 

In the second experiment two 232-word se- 
lections from Geldard (1953) and two of the 
previous selections from de Kruif were used. 
Only the most promising of the vertical styles, 
eight letters to a line and spaced, was com- 
pared with conventional. Forty subjects were 
given 2 minutes to read the short selections 
and 9 minutes to read the long ones. This was 
ample time for all to finish, and most of them 
glanced over the material a second time. The 
total results are 828 questions answered out 
of a possible 1120 for the conventional style; 
816 for the vertical style. This difference, of 
course, is not significant. In terms of reading 
for content as opposed to reading for speed, 
there does not seem to be much difference in
favor of conventional even for unpracticed 
subjects. 


Results of Lieber's Arrangement 


Only 18 subjects were available for this
study; it is hardly more than a pilot study.
The materials were three of the previously
described selections from de Kruif. Each was
typed in three styles: conventional, a version
of rather short units that averaged 4.3 words
on each line, and a version in which some of
the previous units were combined into longer
ones so that this version averaged 6.1 words
to a line. The results reported in Table 1 are
not significant.


Results for Tachistoscope Series 


Three preliminary experiments were run
to select the best arrangement within three
styles: vertical, spaced, and square span.
Table 2 shows the arrangements that were
tested. Analysis of variance indicated that
there was no significant difference among the
variations within each style. Nevertheless
within any style the arrangement with the
largest total score was selected for the
overall comparison of styles.


TABLE 2
ARRANGEMENTS TESTED IN PRELIMINARY
EXPERIMENTS WITH TACHISTOSCOPE

Arrangements in Vertical Style
  No. of Words    No. of Words     Total Score
  in a Unit       to a Line        (No. of Words)
  4.8             1                3879.6
  4.8             1.7              4110.6 a
  8               1.7              4031.4
  8               1                3883.7

Arrangements in Spaced Style
  No. of Words    No. of Spaces    Total Score
  in a Unit       between Units    (No. of Words)
  2.7             3                3905.1
  4.7             3                3920.
  2.7             2                4004.1
  4.7             2                4010.7 a

Arrangements in Square Span b
  No. of Words    No. of Lines     Total Score
  [illegible]     in a Unit        (No. of Words)
  2               2                1211
  3               3                1235
  5               2                1257 a

a Selected for overall comparison of styles.
b For square span, only 21 subjects were [illegible] 24-word sentences.


In the fourth experiment, the selected
arrangements were compared with conventional.
The data were analyzed in terms of the number
of words correctly reproduced. The average
number of words reproduced per subject
for all lengths combined was 74.47 for conventional,
80.22 for spaced style, 80.75 for
square span style, and 86.50 for vertical style.
Table 3 presents the analysis of variance for
the latin square design.


TABLE 3
ANALYSIS OF VARIANCE FOR OVERALL COMPARISON
OF STYLES

Source                                        df      MS

Between subjects                              35
  Interaction: Styles × Sets of sentences
    (between)                                  3      773
  Error (between)                             32      639.5
Within subjects                              108
  Styles                                       3      869.6
  Sets of sentences                            3      394.3
  Interaction: Styles × Sets of sentences
    (within)                                   6      142
  Error (within)                              96      7386
Total                                        143

The F ratio for differences of styles is significant
beyond the .01 level. (The difference
between sets of sentences and the interaction
between styles and sets of sentences are not
significant.)

By a t test, all three new styles were significantly
superior to the conventional style
beyond the .01 level.


DISCUSSION
Spaced 


There have now been some six or seven 
comparisons of spaced and unspaced (Lieber’s 
arrangement is also a technique for spacing). 
Almost all of them have favored the spaced 
version, but only two of them, the one by 
North and Jenkins and the one by Kim were 
significant. Although suggestive, the Japanese 
study on spacing by Sato and Kusajima 
should not be directly compared to the Eng- 
lish studies. Japanese has three different kinds 
of syllabary. The alternation of these kinds of
characters provides organization cues similar to the
spaces between English words.

When all the studies are considered, they 
probably argue for a slight advantage in favor 
of spaced even with untrained subjects. Per- 
haps the effect is slight and variable because 
no one has yet used the optimum arrange- 
ment. Before he can divine the most effective 
way to group phrases, an investigator prob- 
ably needs a considerable knowledge of con- 
stituent analysis and American patterns of 
stress, juncture, and intonation. The present 
investigators, at least, are rather deficient in 
such linguistic training. So far, there have 
been no photographs of eye movements while 
reading the experimental styles. Such photo- 
graphs might give insights into more effective 
ways to group phrases. 

On the other hand, the effect may be slight
and variable only because established reading
habits are interfering with the cues from spacing.
In the unfamiliar tachistoscope presentation,
the effect was significant and fairly large.


Vertical 
In the tachistoscope series, the best vertical
style of typography was definitely superior to
the best of the other three styles. Studies with
Chinese and Japanese characters, in which
horizontal seems to be better than vertical, if 
not in comprehension then at least in speed, 
appear to contradict the results of this study. 
A possible explanation for this discrepancy in 
results may be in differences in the arrange- 
ments of characters. In the vertical arrange- 
ment in oriental languages, the characters are 
printed one above another so that the lines of 
print are long, narrow strings quite similar to 
the long, narrow strings in the horizontal ar- 
rangement. It would be somewhat analogous 
to English written in the following style. 


[Illustration: English letters printed one above another in a single narrow column]


In oriental languages, their vertical style 
wastes horizontal eyespan just as their (and
our) horizontal style wastes vertical eyespan. 

When the subjects read long passages, there
was no significant difference between vertical
and conventional so far as number of questions
answered was concerned. But in terms
of reading speed, the results directly contradict
those with the tachistoscope: vertical was
read significantly slower than conventional.

The contradiction can probably be explained
by reading habits. For years, the subject
had been reading material in which the
line above was not immediately related to the
words that would come next. Similarly, because
the line below was unrelated, by the
time he reached it he had forgotten that he
had half-perceived these words before. Far
from ever using these peripheral cues, he had
to purposely suppress them for the words to
make connected sense.

In the novel tachistoscope situation, the
contrary habits of suppression apparently did
not interfere as much as when long passages
were read. If this explanation is correct, given
some training the reader might become able to
exploit the additional eyespan to some advantage.
He might learn not to suppress the
cues from the line above and the line below
his fixation.

Other explanations for the discrepancy between
the tachistoscope and the reading experiments
are possible of course. Perhaps the
tachistoscope presentation adds some un- 
known constraint that favors the vertical 
style. Or the discrepancy may have been 
caused by a scoring difference. Perhaps con- 
ventional style is so easy to read that the sub- 
ject gets the meaning but fails to report the 
right word (which is what was scored here). 
However, a study by Tinker (1955) shows 
that readers do make rapid improvement in 
learning to read the vertical style. When all 
is considered, the tachistoscope results seem 
promising enough to justify some further ex- 
perimentation with trained subjects. 


SUMMARY 


Five styles of typography (spaced units,
vertical, square span, an arrangement of one
phrase per line, and conventional) were compared
using untrained subjects. This paper reports
two fairly independent series of experiments:
a series using a tachistoscope presentation,
and a series in which the subjects
read passages of about 1500 words.

In the familiar situation of reading long
passages, the subjects were apparently unable
to suppress established reading habits that
interfered with the new styles. Conventional was
read significantly faster than the vertical
arrangement. However in terms of comprehension
(total number of questions answered)
conventional showed only a slight, nonsignificant
advantage. When it was compared to
spaced, conventional showed an equally slight
and nonsignificant disadvantage.

But in the tachistoscope series, in which
the subjects were reading in an unfamiliar
situation, three experimental styles were
significantly superior to conventional. Vertical,
spaced, and square span were all significantly
superior, vertical being the most superior.

The tachistoscope series suggests that there
are advantages to the new arrangements, and
the reading series suggests that subjects must
be trained to read these new arrangements
before the advantage will be fully realized.

REFERENCES

Andrews, R. B. Reading power unlimited. Tex.
Outlook, 1949, 33, 20-21.

Chang, C. Y. A study of the relative merits of the
vertical and horizontal lines in reading Chinese
print. Arch. Psychol., N.Y., 1942, No. 276.

de Kruif, P. H. Men against death. New York:
Harcourt, Brace, 1932.

Geldard, F. A. The human senses. New York: Wiley,
1953.

Klare, G. R., Nichols, W. H., & Shuford, E. H.
The relation of typographic arrangement to the
learning of technical material. J. appl. Psychol.,
1957, 41, 41-45.

Lieber, L. The education of T. C. Mits. New York:
Norton, 1943.

Lindquist, E. F. Design and analysis of experiments
in psychology and education. Boston: Houghton
Mifflin, 1953.

Nahinsky, I. D. The influence of certain typographical
arrangements upon span of visual comprehension.
J. appl. Psychol., 1956, 40, 37-39.

North, A. J., & Jenkins, L. B. Reading speed and
comprehension as a function of typography. J.
appl. Psychol., 1951, 35, 2[illegible].

Sato, Y. Letters and printing. In K. Endo (Ed.),
Science of language. Tokyo: Nakayama Shoten,
1958.

Sato, Y., & Kusajima. Letters and printing. In K.
Endo (Ed.), Science of language. Tokyo: Nakayama
Shoten, 1958.

Tinker, M. A. Perceptual and oculomotor efficiency
in reading materials in vertical and horizontal
arrangements. Amer. J. Psychol., 1955, 68, 444-449.

Woodworth, R. S. Reading and eye-movement. In
R. S. Woodworth, Experimental psychology. New
York: Holt, 1938.


(Received August 22, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 4, 268-270


COMPARISON OF PERFORMANCE ON MANUAL 
AND ELECTRIC TYPEWRITERS 


ROBERT C. DROEGE and BEATRICE M. HILL


United States Employment Service 


The electric typewriter is gaining greater 
popularity in schools and business offices to- 
day and is often used to the exclusion of the 
manual typewriter. Along with this wider use 
of the electric typewriter has come a growing 
demand for persons qualified to operate it. 
Although the electric typewriter is becoming 
more popular, there are still many typists who 
have had experience only on a manual type- 
writer. How well do these individuals perform 
when they switch over to an electric type- 
writer? Can their performance be predicted 
with any degree of accuracy? The specific pur- 
pose of this study was to compare perform- 
ance on the electric typewriter and manual 
typewriter and investigate the possibility of 
developing conversion tables which would en- 
able the prediction of performance on an elec- 
tric typewriter from performance on a manual 
typewriter. 

METHOD 

The sample consisted of 575 individuals with at
least 6 months of experience on the electric typewriter.
Of those individuals originally tested for the
study 38 were dropped from the sample because they
made an excessive number of errors (more than 60)
on either the manual typewriter or the electric typewriter.
It was obvious after examining the papers of
individuals who made more than 60 errors in the
10-minute time limit, that these individuals could not
meet the minimum standard of proficiency for typing
jobs. It was not even necessary to obtain a formal
total of the number of errors made to reach
this conclusion. With the elimination of the obviously
unqualified, the results become more applicable
to those with at least a minimum of typing proficiency.
The final sample consisted of individuals from
seven states. The N in the various states ranged from
33 to 105. All individuals were either Employment
Service applicants or employed typists who volunteered
for the testing.

All persons in the [illegible] administered either
at local Employment Service offices or at the employed
workers’ stations of work by Employment Service
personnel, according to the instructions contained in
the USES Guide to the Use of Typing, Dictation, and
Spelling Tests.


RESULTS 


Table 1 shows the ranges, means, and standard
deviations of words-per-minute (wpm)
and error scores for the total sample. Table 2
shows the correlations between scores on the
manual and electric typewriter, the differences
in mean scores and standard deviations of
scores on the two typewriters, and the t ratios
corresponding to the differences for the total
sample.

The following points are based on the results
shown in Tables 1 and 2:

1. There is a substantial relationship between
wpm scores obtained on manual and
electric typewriters but a lower relationship
between error scores on the two typewriters.

2. The 9.17 difference between mean wpm
scores is significant at the .01 level, indicating
an advantage in favor of the electric typewriter.

3. The 1.70 difference in standard deviations
of wpm scores on the two typewriters is
significant at the .01 level, indicating a greater
degree of variability in wpm typed on the
electric typewriter than on the manual typewriter.

4. The 2.14 difference between mean error
scores is significant at the .01 level, indicating
that fewer errors are made when the electric
typewriter is used.

5. The 1.71 difference in standard deviations
of error scores on the two typewriters is
significant at the .01 level, indicating less
variability in error scores made on the electric
typewriter than on the manual typewriter.


TABLE 1
RANGES, MEANS, AND STANDARD DEVIATIONS OF
WORDS-PER-MINUTE AND ERROR SCORES ON THE
ELECTRIC AND MANUAL TYPEWRITER FOR
THE TOTAL SAMPLE
(N = 575)

               Electric     Manual
WPM
  R            36-101       28-86
  M            65.28        56.11
  SD           11.22        9.51
Errors
  R            0-57         0-59
  M            14.80        16.93
  SD           10.19        11.90


TABLE 2
PRODUCT-MOMENT CORRELATIONS BETWEEN SCORES
ON THE MANUAL AND ELECTRIC TYPEWRITER (r),
DIFFERENCES IN MEAN SCORES AND STANDARD
DEVIATIONS OF SCORES ON THE TWO TYPEWRITERS
(D), AND t RATIOS CORRESPONDING TO THESE
DIFFERENCES FOR THE TOTAL SAMPLE
(N = 575)

                      Differences in      Differences in
                      Means               Standard Deviations
            r         D         t         D         t
WPM        .76        9.17     29.60      1.70      7.8
Errors     .62       -2.14      5.24     -1.71      5.9

DISCUSSION 


The size of the relationship between scores
on the manual and electric typewriter is a
function of the reliability of the test and the
set of differences associated with the two kinds
of typewriters. To get some idea of the relative
importance of these two factors, it is necessary
to compare the correlations obtained in this
study with estimates of repeat reliability of
the test [illegible] on the same kind of
typewriter. Estimates of the reliability of USES
Typing Test Form No. 6 are available.
[illegible] Employment Service applicants were
tested in 1958 with Typing Test Form No. 6 on
manual typewriters and then retested (after a
short break) with the same form on the same
typewriters. The test was administered with a
5-minute time limit instead of the 10-minute
time limit used in the present manual-electric
typewriter study. The results showed that
the correlation between initial and retest
scores was .97 for words per minute and .85
for errors. The differences between these correlations
and those in the present study (.76
for words per minute and .62 for errors) are
significant at the .01 level, indicating that
differences associated with operation of manual
and electric typewriters are of some
importance.

It may be concluded that an assessment of 
performance on the manual typewriter does 
not provide a completely satisfactory basis 
for predicting performance on the electric 
typewriter. While it is true that the correla- 
tion between wpm scores (7 = .76) is sub- 
stantial enough to permit a fairly satisfactory 
prediction of speed qualifications on the elec- 
tric typewriter, the relationship between 
error scores (r = .62) is not high enough to 
permit satisfactory prediction of accuracy 
qualifications on the electric typewriter. 

The difference in mean wpm scores and 
error scores attained on the two kinds of 
typewriters might have been smaller if more 
stringent controls had been placed on the 
experience requirements for the examinees. 
It will be remembered that all examinees were 
required to have at least 6 months of experi- 
ence on the electric typewriter, but there was 
no minimum amount of experience required 
on the manual typewriter. It is quite possible 
that if the amount and recency of experience 
on both kinds of typewriters had been equated, 
the scores obtained on the electric and manual 
typewriters would have been more in line 
with each other. Further, the effect of changes 
in emotional attitude occasioned by the shift 
from the electric typewriter (to which the 
examinees were more accustomed) to the
manual typewriter should not be overlooked 
as a factor affecting both speed and accuracy. 
While the extent to which emotional factors 
affected performance in this study is un- 
known, these factors very likely did have some 
influence on mean scores and correlations be- 
tween scores of individuals in the sample. 
Difficulty in adapting to the typing touch of 
the manual typewriter after using the electric 
typewriter is another possible factor which
may have affected the results in this study.
Various specific difficulties typists have in ad- 
justing to a change from an electric to a 
manual typewriter have been indicated by 
students in a study of training on manual and 
electric typewriters (Savage, 1953). 

The United States Employment Service is 
developing a new set of typing tests based 
on current material used in business and industry.
Because of the results obtained in this
study, it has been decided that these new 
tests should be standardized separately for 
manual and electric typewriter operators. 
Once norms for the electric typewriter are 
established, applicants for positions calling 
for qualified electric typewriter operators can 
be tested on an electric typewriter and their 
scores can be evaluated against norms based 
on performance of a representative sample of 
electric typewriter operators. 


SUMMARY 


The possibility of developing conversion 
tables which would enable prediction of 
performance on an electric typewriter from 
performance on a manual typewriter was in-
vestigated. The sample consisted of 575 ex-
perienced electric typewriter operators from 
seven states. They were tested initially on an 
electric typewriter and then retested on a
manual typewriter. The tests used were equiv- 
alent forms of the United States Employment 
Service Typing Test. The results showed that 
there was a substantial relationship (r = .76)
between words-per-minute scores on manual 
and electric typewriters but only a moderate 
relationship (r = .62) between error scores 
on the two typewriters. A further analysis of 
the data showed that, on the average, indi- 
viduals in the sample typed about nine words 
per minute faster and made about two fewer 
errors on the electric typewriter than on the 
manual typewriter. These differences were 
statistically significant. The standard deviations
of words per minute scores and error
scores for the two typewriters were also
significantly different.


REFERENCE 


Savage, W. G. Report on electric and manual typewriting.
J. bus. Educ., 1953, 29, 111-112.


(Received August 23, 1960) 




Journal of Applied Psychology 
1961, Vol. 45, No. 4, 271-272


AN AMERICAN APPLICATION OF EYSENCK'S SHORT
NEUROTICISM AND EXTRAVERSION SCALES


WILLIAM D. WELLS, HOWARD E. EGETH, AND NANCY P. WRAY


Newark Colleges, Rutgers University 


In 1958 Eysenck published an article describing
two short personality scales suitable
for use in market research interviewing
(Eysenck, 1958). He reported that the scales
gave reliable measurements on two independent
personality dimensions, and that there
were significant differences along these dimensions
among certain segments of the English
consumer population. It has recently been
possible to use Eysenck's scales in some studies
of a group of American housewives, and
the comparisons between his results and ours
seem worth noting.


SUBJECTS AND METHOD


In comparing Eysenck's data with data
from the present study, it is important to keep
in mind some major differences between the
two sets of respondents. Eysenck's 1,600 subjects
were drawn from a nationwide sample
of the English consumer population. His sampling
procedure insured proper proportions of
urban and rural residents, and proportional
representation of the English population in
terms of economic class, sex, and age. The
189 subjects in our study were middle and
upper-middle class housewives living in the
metropolitan area surrounding Newark, New
Jersey. They were considerably better educated
than the American national average
([...]6% had at least some college), and they
were primarily of Jewish extraction. Their
responses to Eysenck's scales were collected
in a tryout of a number of questionnaire
items in a pilot investigation. Because of the
preliminary nature of the work, we had made
no attempt to draw a representative sample
of a defined population.


RESULTS


In spite of the differences between sets of
respondents, it seemed worthwhile to find out
whether our subjects differed significantly
from Eysenck's subjects on the neuroticism or
extraversion dimensions. They did, on both
(Table 1). The women we interviewed were 
significantly less neurotic and significantly
less extraverted than the women in Eysenck’s 
sample. Although the inadequacy of our 
sample precludes any general comparison of 
English women vs. American women, it is
perhaps comforting to note that the person- 
ality test scores of at least one group of Ameri- 
can women run counter to opinions expressed 
by some observers of the American scene. 
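The Table 1 comparison can be cross-checked from summary statistics alone. The sketch below pools the smoker and nonsmoker means of Table 2 to recover the American means, then computes t with an unpooled standard error. Two points are assumptions rather than statements from the article: the standard-error formula (the authors do not say how t was computed), and the reading of the English extraversion mean as 1.71, which is partly garbled in this copy. Function names are illustrative.

```python
import math


def pooled_mean(mean_a, n_a, mean_b, n_b):
    """Weighted mean of two subgroups."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)


def t_from_summary(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """t for two independent groups, using an unpooled standard error."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# American means recovered from Table 2 (107 smokers, 73 nonsmokers)
american_n = pooled_mean(-0.12, 107, 0.08, 73)
american_e = pooled_mean(1.23, 107, 0.46, 73)

# English women (N = 800) vs. American women (N = 180)
t_neuroticism = t_from_summary(1.00, 3.42, 800, american_n, 3.42, 180)
t_extraversion = t_from_summary(1.71, 2.97, 800, american_e, 2.96, 180)  # 1.71 is an assumed reading

print(f"American means: N = {american_n:.2f}, E = {american_e:.2f}")
print(f"t: N = {t_neuroticism:.2f}, E = {t_extraversion:.2f}")
```

Under these assumptions the computed values fall within rounding distance of the 3.69 and 3.24 printed in Table 1, which suggests the two tables are mutually consistent.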

Of more general interest are the results ob- 
tained from analysis of the scales themselves. 
Eysenck reported a correlation of -.05 between
his neuroticism scale and his extraversion
scale. This lack of correlation seemed
puzzling in view of the fact that maximum 
scores on both of Eysenck’s scales are ob- 
tained by answering “yes” to all questions. 
Recently published research on “agreeing 
response set” (Couch & Kenniston, 1960) 
indicates that scales scored in this way almost 
always correlate at least moderately. 

In our data, the correlation between Eysenck's
neuroticism scale and his extraversion
scale was -.08. This finding confirms
that the two scales are uncorrelated, despite


TABLE 1
COMPARISON OF AMERICAN AND ENGLISH RESPONDENTS
ON EYSENCK'S NEUROTICISM AND
EXTRAVERSION SCALES


             Neuroticism Scale         Extraversion Scale
Statistic    American      English     American    English
M            [illegible]   1.00 a      .92         1.71 a
SD           3.42          3.42        2.96        2.97
N            180           800         180         800
t                   3.69                      3.24


a Eysenck did not give separate means for men and women
in his report. We estimated the means for the 800 women in
his sample from his statement that "on N, the women have a
score roughly ½ SD higher than the men; on E, the men have a
score roughly [illegible] SD higher than the women." He gave .15 and
[...]96 as the overall means for N and E, respectively. Both t
values are significant beyond the .01 level.




TABLE 2
COMPARISON OF SMOKERS AND NONSMOKERS ON EYSENCK'S SCALES AND ON AGREEING RESPONSE ITEMS


             Neuroticism            Extraversion           Agreeing Response
Statistic    Smokers  Nonsmokers    Smokers  Nonsmokers    Smokers  Nonsmokers
M            -.12     .08           1.23     .46           80.2     76.6
SD           3.06     3.88          3.10     2.74          11.1     11.9
N            107      73            107      73            107      73
t                 .37                    1.75                   2.03*


* Significant beyond .05 level.


the potential mutual variance introduced by
the scoring system. Some information on the
influence of the scoring system is provided
by correlations between Eysenck's scales and
a scale of 20 items found by Couch and Kenniston
to be highly loaded on the agreeing
response set dimension. Eysenck's neuroticism
scale correlated .35 with these items,
and his extraversion scale correlated .28 with
them. With an N of 180, both these correlations
are significant well beyond the .01 level.
Further understanding of the scales can be
gained from an examination of their reliabilities.
Eysenck reported corrected split-half reliabilities
of .79 for the neuroticism scale and
.71 for the extraversion scale. In our data the
corrected split-half reliability for the neuroticism
scale held up reasonably well: it was .72.
But for extraversion the corrected split-half
reliability was a disappointing .41. For the
Couch and Kenniston items it was .62.
Fitting these findings together, it appears
that both the neuroticism scale and the extraversion
scale are significantly related to agreeing
response set, but this relationship is not
strong enough to force the scales into correlation
with each other. Because the reliabilities
of all three measures are only middling,
it is possible that both neuroticism and extraversion
might turn out to be more strongly
related to agreeing response set if the measurements
could be made more accurately.
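The reasoning about reliability can be made concrete with two standard psychometric formulas: the Spearman-Brown correction behind a "corrected split-half" reliability, and the correction for attenuation, which estimates how strongly two measures would correlate if both were perfectly reliable. The sketch below is an illustration added here, not a computation the authors report. It uses the reliabilities of .72 and .41 given in the Summary, the .62 for the agreeing response items, and the observed correlations of .35 and .28; function names are illustrative.

```python
import math


def spearman_brown(r_half):
    """Full-length reliability from the correlation between two half-tests."""
    return 2 * r_half / (1 + r_half)


def half_from_corrected(r_full):
    """Invert Spearman-Brown: half-test correlation implied by a corrected value."""
    return r_full / (2 - r_full)


def disattenuate(r_xy, r_xx, r_yy):
    """Correlation corrected for unreliability in both measures."""
    return r_xy / math.sqrt(r_xx * r_yy)


# Half-test correlation implied by the corrected neuroticism reliability of .72
half_n = half_from_corrected(0.72)

# Observed correlations with the agreeing response items, corrected for attenuation
corrected_n = disattenuate(0.35, 0.72, 0.62)
corrected_e = disattenuate(0.28, 0.41, 0.62)

print(f"implied half-test r (N): {half_n:.2f}")
print(f"disattenuated r: N = {corrected_n:.2f}, E = {corrected_e:.2f}")
```

On these figures both corrected correlations come out a little above .50, which gives a rough sense of how much stronger the relationship to agreeing response set might appear with more accurate measurement.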
Eysenck reported significant age and social 
class differences on the neuroticism scale, the 
lower class and younger age groups being 
slightly more unstable emotionally. In our 


1 The items were the 19 items in Couch and Kenniston's
Table 9 plus the sixth item in their Table 8.


data the difference was in the same direction
for age, and in the opposite direction for class.
Both differences were small and neither was
statistically significant.

Eysenck also reported a significant difference
on his extraversion scale between
"drinkers" and "nondrinkers." We did not
ask about drinking habits, but we did ask
about smoking. The smokers among our respondents
did not differ significantly from the
nonsmokers in either neuroticism or extraversion
(Table 2). They did differ significantly
on the agreeing response set dimension.
To the extent that our sample is representative,
women who smoke appear to be slightly
more likely to say "yes."


SUMMARY 


Eysenck's short neuroticism and extraversion
scales were used in interviews with
American housewives. The scales proved to
be uncorrelated with each other, even though
both were significantly correlated with a measure
of agreeing response set. The reliability
of the neuroticism scale was .72; and of the
extraversion scale, .41. Some differences between
the present results and Eysenck's results
were discussed.


REFERENCES

Couch, A., & Kenniston, K. Yeasayers and naysayers:
Agreeing response set as a personality
variable. J. abnorm. soc. Psychol., 1960, 60,
151-174.
Eysenck, H. J. A short questionnaire for the measurement
of two dimensions of personality. J. appl.
Psychol., 1958, 42, 14-17.


(Received August 24, 1960) 




Journal of Applied Psychology 
1961, Vol. 45, No. 4, 273-276


“REAL-LIFE” FAKING ON THE STRONG VOCATIONAL 
INTEREST BLANK BY SALES APPLICANTS 


WAYNE K. KIRCHNER 


Minnesota Mining and Manufacturing Company


The Strong Vocational Interest Blank can
be faked. Studies by Longstaff (1948), Gehman
(1957), Benton and Kornhauser (1948),
Bordin (1943), and Strong (1943), among
others, have shown that persons when directed
to fake the SVIB can do so rather effectively.

Longstaff has shown that the Strong can be
faked upward fairly easily. Gehman has shown
that engineering students can look like social
service type persons if directed to do so.
Benton and Kornhauser, too, have shown
that medical interests can be faked. In general,
then, "faking" can be accomplished on
the SVIB if the subjects are asked to do this.

All of these studies, however, seem lacking
in "real-life" motivation. As far as can be
determined in these studies, the subjects all
were students who were asked to fake the
test. This approach is, of course, worthwhile
but it does not get at faking in more natural
situations such as selection. It would seem
worthwhile to investigate the responses of
persons not directed to fake but who are taking
the test under conditions where it is to
their advantage to "look good." The prime
example of this, of course, would be the job
applicant completing the SVIB as part of the
selection process.

This study is an attempt to throw some
light on faking on the SVIB in a real-life
situation by comparing responses made by
sales applicants (later hired) with those of
presently employed salesmen.


METHOD


As part of a lengthy follow-up study
of 1957 sales applicants who became
salesmen in a large, midwestern company,
Strong Vocational Interest Blank data was
reviewed for the total sample of 258 such
applicants. These persons all completed the
SVIB prior to being hired as part of the
selection process.



From this original group, two subgroups 
of 92 Retail applicants and 64 Industrial 
applicants, respectively, were obtained on the 
basis of later job duties. In effect, this pro- 
duced fairly “pure” groups of persons en- 
gaged strictly in Retail selling or strictly in 
Industrial selling. This division was deemed 
necessary, for a myriad of studies, including
those of Dunnette and Kirchner (1960),
Witkin (1956), and Hughes and McNamara 
(1958), have shown that there are different 
kinds of salesmen with different interest pat- 
terns. For example, Dunnette and Kirchner 
have found that salesmen engaged in Retail 
selling are different from their Industrial 
counterparts on personality measures and on 
the SVIB, with Retail salesmen tending to be 
more like the traditional stereotype of the 
salesman. 

These differences could obscure or confound 
tendencies toward faking; hence, the split was 
made into the two basic categories of sales- 
men found in this company. 

Following this, SVIB occupational scale 
responses for the applicant groups were 
compared with those of presently employed 
company salesmen engaged in Retail or In- 
dustrial selling. These salesmen groups (Re- 
tail, N = 68; Industrial, N = 49) were part 
of a random sample of 196 salesmen from a 
total group of over 700 who took the SVIB 
and other tests on a volunteer basis as part 
of a concurrent validity study. Each sales- 
man in the two comparison groups had a 
minimum of 5 years’ sales experience in the 
company. The division in terms of job duties 
was accomplished by using the Sales Job De- 
scription Checklist (Dunnette & Kirchner, 
1958). Summing up then, SVIB responses for 
applicants who later became Retail and In- 
dustrial salesmen were compared with those 
of veteran Retail and Industrial salesmen. 

Scoring of the SVIB profiles was done by 
converting T scores on each occupational 




TABLE 1 


MEANS AND STANDARD DEVIATIONS ON 48 STRONG VOCATIONAL INTEREST BLANK SCALES FOR RETAIL AND
INDUSTRIAL SALESMEN AND SALES APPLICANTS


Columns, left to right, first for Retail and then for Industrial:
Salesmen M, SD; Applicants M, SD; Mean Difference (Applicant - Salesmen); k a
Retail: Salesmen N = 68, Applicants N = 92. Industrial: Salesmen N = 49, Applicants N = 64.

Artist 179 .77 136 .70 3.62 150 8 Lor 
Psychologist 1.74 87 — 1.99 “gy 1.85 2156 94 » 
138 1.23 82  .88 3.19 1.16 1.06 kt 
Physician 197 .86 — 160 “84 2.72 219 82 122 
Osteopath 2.99 98 265 .78 2.36 2.89 85 1106 
Dentist 1.84 .80 — 1167 3.93 1.72 .80 163 
Veterinarian 284 92 213 115 441 25  .76 : 
2,09 
Mathematician 50 86 33.59 81.83 53 Ti T10 
Physicist 28 .58 17 50 .67 1.02 4T BE 3.08 
Engineer 2.16 83 203 49 2.65 .92 216 .72 195 
Chemist 1.69 86 143 ‘82 2.06 .31 — 2.02 1.00 : 
24 
Production Manager 3.56 85 — 376 2 30  L58 3.6 76 3.77 87 5 
.60 
Farmer 316 .601 293 .72 -—23 222 3.27 69 — 3.20 49 —9 005 
Aviator 3.37 80 303 .80 2.66 3.74 78 344 78 =30 Ag 
Carpenter 2.45  .86 1.99 1.01 1.10 2.39 1.07 231 91 ) ad 
Printer 3.07 .81 313 .81 A5 3.28 96 — 348  .83 23 
Mathematics Physics 49 
Science Teacher 324 2 — 313 107 EST 79 3.53 .86 344 1.10 —09 — '$6 
Industrial Arts Teacher 144 L18 — 136 117 —.08 43 2.02 113 1.84 1.06 =.18 
V ocational Agriculture 45 
g Macher 2.66 .87 2.63 98 —.03 20 2.80 .97 2.88  .89 208 ET 
i iceman | 3.50 81 361 76 afi 87 3.55  .78 3.66  .78 H 1,51 
orest Service Man 2.56 .88 212 101 —44 294 26 (82 253 78 —.23 
- : 5 
YMCA Phys. Director 3.53 108 3.8 s; me 
à s 53 1. .81  .80 28 180 3.53 113 — 385  .86 E 65 
"esame — 12 $ fe 5 —— n [à US 9 S 
| 4 strator 81 .77 405 7s 2 1 4. 419.77 19 
YMCA Secretary 3:24 86 366 76 E: aoe B A ame 82 40 
Gig tate 3.69 .88 411 179 a2 312 355. ;88 — 398 95 3o ye 
City School Supt. 2.56 .77 238 80 1 i EE 7 : 25 f 
dies . 5 d K 53 d 2.78 .74 s ES 
Social Worker 3.62 91 337 36 s 10 ate Qu gos da 3:0 d$ 
Minister s : 1.76 3.76.96 as i 
240 .96 232 ‘81 250 1.53 248 .98 2.41 .83 dá " 
ici E 
Musician 282 2 286 8 Ot a 294 4 328 92 31} 
à ; T : P 
CPA 238 .71 2464 95 2 13 d 
Seni 3 E = [ -26 1.9; 2.5  .74 248  .87 1.45 
Senior CPA 346 185 — 386 89 db E 22 US 4o vw 2 ti 
Accountant 2.97 .76 3.69 84 32 5.66 3.08 .90 3.52 1.02 A gas 
Office Man 4.22  .72 442 65 20 181 4.00 .90 4.32  .57 320 59 
Purchasing Agent 3.63 .64 387 71 2i 2 3:30 gs 3.65 BM 326 — U35 
Banker 290 «71 SAS $5 .23 1.86 2.71 .76 291 Si 0 — "B8 
Mortician 429 .83 442 .62 A3 — 1,09 3.96 90 410 72 M got 
Pharmacist $84 .82 396 .67 42 .99 3.49 67 3.92 90 #3 " 
4 
Sales Manager 5.00.77 480 36 6 —.05 162 
s . z E —.22 2.19 4.71.67 4.66 — .4i ES 
Real Estate Salesman 404 80 471 40 — —23 218 4.67 .80 — 4,59 47 E NE 
Life Insurance Salesman 497 82 452 ‘60 —45 383 4.47 .901 433 .65 -44 P 
ó 
Advertising Man 3.99 .76 3.8 —.09 16 
3. i $84  .67 —15 130 3.82.75 — 373 M 1 
E l 3.00 69 304 69 04 36 296 164 298 69 020 70 
uthor-Journalist 2.82 61 2.65 69 —47 165 2.71 64 2.50 .66 —.21 1 
Pres.-Manufacturing Concern 3 25  .69 02 68: 
$ . à 338 gi A3 — 146 3.35 85 3.33  .63 4. 
5 . 
eut Maturity 5.15.58 — 4392 32 23 2595 5.27 148 4.04 13 3s En 
Masculine-Femini 525 63 4.71 45 54 — 6.02 $10 .79 471 44 : 1.2 
iculine-Feminine 4.34  .80 4.26 64 .08 .68 4.74  .80 4.58 — .53 16 


a k is the mean difference between the two groups expressed in terms of the standard error of the difference:

$$k = \frac{M_1 - M_2}{\sqrt{\sigma_1^2 / N_1 + \sigma_2^2 / N_2}}$$

All k values of 1.96 or greater are statistically significant at the .05 level of probability.




scale into an arbitrary eight-step scale (primarily
to avoid negatives) as follows:

SVIB T SCORE
(Occupational Scales)     EIGHT-STEP SCALE
-10 to +4                 0
5 to 14                   1
15 to 24                  2
25 to 34                  3
35 to 44                  4
45 to 54                  5
55 to 64                  6
65 to 74                  7

M-F, I-M, OL scales were converted from
T scores to a seven-step scale as follows:
10 to 19, 1; 20 to 29, 2; etc.
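The two conversions above are simple lookups and can be written compactly. The following is a minimal sketch, assuming integer T scores. The function names are illustrative, and the upper limit of 79 for the seven-step scale is inferred from the "etc." rather than stated in the text.

```python
def occupational_step(t_score):
    """Map an occupational-scale T score (-10 to 74) to the eight-step scale (0 to 7)."""
    if not -10 <= t_score <= 74:
        raise ValueError("T score outside the range covered by the conversion table")
    # Bands above the first are 10 points wide and start at 5, 15, 25, ...
    # The first band (-10 to +4) is wider, so it is clamped to step 0.
    return max(0, (t_score + 5) // 10)


def nonoccupational_step(t_score):
    """Map an M-F, I-M, or OL T score (10 to 79, upper limit assumed) to steps 1 to 7."""
    if not 10 <= t_score <= 79:
        raise ValueError("T score outside the assumed range of the seven-step scale")
    return t_score // 10


print([occupational_step(t) for t in (-10, 4, 5, 44, 45, 74)])
print([nonoccupational_step(t) for t in (10, 19, 20, 79)])
```

Clamping the lowest band to zero is what keeps the recoded scores non-negative, which is the stated purpose of the eight-step scale.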

Mean scores and standard deviations were 
computed for the four groups, Retail Sales- 
men, Retail Applicants, Industrial Salesmen, 
and Industrial Applicants, and mean differences
and k values were also compiled.

The critical assumption behind all these
comparisons was that presently employed
salesmen with over 5 years' experience had a
very small "axe to grind" and were fairly
honest in their answers, while the applicants
striving to get a job were more prone to look
good. Thus, the applicant group was assumed
to lean more toward faking, and differences
between the two groups on the SVIB were
hypothesized to be the result of faking
tendencies.


RESULTS AND DISCUSSION


Basic comparisons and results are shown
in Table 1.

Two results are readily apparent. First,
there are many and consistent differences
between SVIB scale scores of applicants and
salesmen. Of the 96 mean differences shown,
32 are statistically significant at the .05
level of probability or better. Second, applicants
score higher in Business occupations and
Social Service jobs and on Occupational Level,
and lower in Scientific and technical
occupations and on Sales keys, in both
groups. These results suggest then that
applicants who are assumed to be leaning
over backward on their SVIB responses to look good
are not too good at "upping" their scores on



the Sales keys. Instead, they “boost” their 
scores in Business, Social Service, and Person- 
nel areas plus showing a stronger professional 
orientation (higher OL). 

Why does this occur? The answer in this 
case seems to be that applicants in filling out 
the SVIB answer sheet are shying away from 
“dislike” answers. They tend to be giving 
more "like" and "indifferent" answers. As
Berdie (1943) has shown and which anyone 
can verify for himself by completing three 
SVIB answer sheets with all “likes,” all “in- 
differents," and all “dislikes,” like responses 
are associated with high scores in Social Serv- 
ice (Group V) activities and in Business De- 
tail jobs. Likewise, indifferent responses are 
helpful in getting higher Social Service scores. 
Dislike responses boost technical and scientific 
scores. 

One hypothesis that might explain this is 
that applicants try to answer in the most so- 
cially acceptable fashion. This could mean 
showing a liking for most things. Veteran 
salesmen, on the other hand, should not have 
as great a need to give socially acceptable 
answers and can “confess” they dislike a few 
things. 

In this vein, it is interesting to note that 
in a study by Dunnette, Kirchner, and De- 
Gidio (1958) the Strong area that correlated 
highest with the Good Impression scale on the 
California Psychological Inventory was the 
Social Service area. 

“Real-life” faking for sales applicants at 
least as shown in this study, then, boils down 
to showing a liking for many things which 
may be the socially desirable way to complete 
the SVIB. Unfortunately, this does not help 
the applicant much in boosting his Sales key 
scores but strongly affects other areas. It may 
reflect a naive kind of test taking behavior on
the part of the applicant.


SUMMARY


Responses made on the Strong Vocational 
Interest Blank for 92 Retail and 64 Indus- 
trial sales applicants (later hired) as part of 
the selection procedure were compared with 
SVIB responses made by 68 Retail and 49 
Industrial salesmen employed at least 5 years 
who completed the SVIB voluntarily as part 
of a concurrent validity study. It was hy- 


pothesized applicants would be trying to 
“look good"; salesmen would be more “hon- 
est." Of 96 mean differences on the 48 scales, 
32 were significant at the .05 level. Applicants 
tended to be higher in both Retail and Indus- 
trial settings in Social Service and Business 
occupations and lower in Technical, Scientific, 
and surprisingly, Sales. Apparently, applicants 
indicate a greater liking for things than do 
employed salesmen, which suggests the idea of 
completing the SVIB in the most socially 
acceptable fashion: i.e., liking much, disliking 
little. 
REFERENCES

Benton, A. L., & Kornhauser, G. I. A study of
"score faking" on a medical interest test. J. Ass.
Amer. Med. Coll., 1948, 23, 57-60.

Berdie, R. F. Likes, dislikes, and vocational interests.
J. appl. Psychol., 1943, 27, 180-189.

Bordin, E. S. A theory of vocational interests as dynamic
phenomena. Educ. psychol. Measmt., 1943,
3, 49-65.

Dunnette, M. D., & Kirchner, W. K. Psychological
test differences between industrial salesmen and
retail salesmen. J. appl. Psychol., 1960, 44, 121-125.

Dunnette, M. D., Kirchner, W. K., & DeGidio, J.
Relations among scores in Edwards Personal Preference
Schedule, California Psychological Inventory,
and Strong Vocational Interest Blank for an industrial
sample. J. appl. Psychol., 1958, 42, 178-181.

Gehman, W. S. A study of ability to fake scores on
the Strong Vocational Interest Blank for Men.
Educ. psychol. Measmt., 1957, 17, 65-70.

Hughes, J. L., & McNamara, W. J. Limitations on
the use of Strong sales keys for selection and counseling.
J. appl. Psychol., 1958, 42, 93-96.

Longstaff, H. P. Fakability of the Strong Interest
Blank and the Kuder Preference Record. J. appl.
Psychol., 1948, 32, 360-369.

Strong, E. K. Vocational interests of men and
women. Stanford: Stanford Univer. Press, 1943.

Witkin, A. A. Differential interest patterns in salesmen.
J. appl. Psychol., 1956, 40, 338-340.


(Received September 23, 1960) 




Journal of Applied Psychology 
1961, Vol. 45, No. 4, 277-280


ORGANIZATION AND CREATIVE PROBLEM 
SOLVING 


NORMAN R. F. MAIER AND L. RICHARD HOFFMAN


University of Michigan 


The Western Electric studies (Mayo, 1933; 
Roethlisberger & Dickson, 1939) were the 
first of an ever-increasing number which have 
pointed to the tremendous motivational potential
which could be unleashed if organizations
attended to workers' needs. More recently
social scientists and businessmen have
suggested that people in lower levels of or- 
ganizations have problem solving capabilities 
which also have been relatively unused (Mc- 
Gregor, 1960; March & Simon, 1958; Worthy, 
1959). Because most companies follow tradi- 
tional organizational theory and practice, de- 
cision making authority is retained at higher 
levels of the organization. People at lower 
levels rarely are asked to make decisions and 
become accustomed to accepting orders from 
above.

At every level of management the superior 
frequently makes decisions about subordinate 
behavior which he feels are necessary to 
attain some organizational objective. Fre- 
quently, neither the subordinates’ feelings 
about the decision nor the subordinates’ in- 
formation, knowledge, or skills relevant to the 
decision are considered. The ignoring of subordinates'
feelings frequently results in a
poorly motivated acceptance of the decision.
Failing to consider the subordinates' information,
knowledge, and skills often results in
forcing subordinate acquiescence to a truly
poor decision.

People familiar with the operations of large-scale
organizations will have no difficulty recalling
instances where poor decisions have
been carried out because higher management
insisted that they be done, even though the
subordinates knew the action was doomed to
failure. To what extent are organizations
which operate under the classical management


1 This investigation was supported by a USPHS
research grant (M-2704) from the National Institute
of Mental Health, United States Public Health
Service.


philosophy failing to use the human creative 
potential which is available to them? 

This paper reports one answer to this ques- 
tion by comparing the frequency of creative 
problem solutions obtained from groups which 
varied in their amount of experience and 
identification with existing organizations, es- 
pecially in business and industry. 


METHOD 
Problem 


The role playing case, the Change of Work Pro- 
cedure problem (Maier, 1955), was used to test for 
creative problem solving. Although the case is de- 
scribed completely in the cited reference, a summary 
of its more relevant characteristics will be given here. 

A foreman attempts in a group meeting to con- 
vince three workers to change from their present 
work method (Old solution) to a work method 
recommended by a time-study man (New solution). 
When the foreman decides “to take up the problem 
with the men,” a conflict arises between the economic 
advantages of each man working only on his best 
position, as against the relief from boredom pres- 
ently gained by rotating hourly among the three 
positions. In most groups the foremen try to convince 
the workers of the advantages of the New work 
method and, after the relative merits of the Old 
and New methods are discussed, the workers accept 
(most do) or reject the New method. Occasionally 
a group will break out of this choice situation and 
develop an alternative work method which both 
exploits the abilities of the workers to do different 
jobs and avoids the boredom resulting from working 
on a repetitive job all day. These solutions have 
been called Integrative and, on the basis of previous 
studies using this case, appear to be a valid index 
of the creative problem solving ability of a group. 
Integrative solutions have been produced more fre- 
quently: (a) by groups led by foremen trained in 
human relations as compared to untrained foremen 
(Maier, 1953), (b) by groups solving the problem 
a second time as compared to their first solution 
attempts (Maier & Hoffman, 1960), and (c) by 
groups composed of heterogeneous personalities as 
compared to groups of homogeneous personalities 
(Hoffman, 1958; Hoffman & Maier, 1961). The 
ability to turn a situation of choice between two 
work methods into a problem situation and develop 
a work method different from those that are obvious 
from the role instructions—a method which incor- 




TABLE 1
DISTRIBUTIONS OF SOLUTIONS FROM GROUPS WITH DIFFERENT ORGANIZATIONAL IDENTIFICATION


                                                          Type of Solution
                                              Old           New        Integrative       Total
Group                                       N      %      N      %      N      %       N      %
Employed a                                 19.5   28.3   41.5   60.1    8     11.6     69    100.0
Business Administration students            2      7.1   20     71.5    6     21.4     28    100.0
Psychology of Human Relations students      8     16.0   21     42.0   21     42.0     50    100.0
Introductory Psychology students            4     12.5   13     40.6   15     46.9     32    100.0


Note. Chi square test of this relationship is significant at the .01 level of confidence.
a Groups of industrial foremen (11), airline managers (10), training directors (10), hospital managers (7),
and nursing supervisors (31).


porates the benefits of both methods—suggests a 
generally creative orientation to problems. The pro- 
portion of Integrative solutions produced by each 
set of experimental groups will be compared. 


Subjects 


Subjects from four different populations role- 
played the case. The populations were selected to 
represent differing degrees of experience and/or 
identification with vocational careers in large-scale
organizations, especially business or industrial or- 
ganizations. 

Organizationally Employed. Sixty-nine groups of 
people presently employed in a variety of large 
organizations represent the population with the 
greatest vocational commitment. Except for 10 groups 
of airline managers who role-played the case early 
in a one-week training program, all other groups 
solved the problem during one-day conferences following
an hour's lecture on motivation. The remaining
59 groups consisted of 11 groups of industrial
foremen,2 10 groups of industrial training
directors, 7 groups of members of the administration 
of a hospital, and 31 groups of nursing supervisors, 
Despite this diversity of organizational experience, 
the distributions of solutions from these several 
groups hardly differed at all. In addition to being 
currently employed, the members of these groups 
tended to be somewhat older than subjects from the 
other three populations. 

Business Committed. Junior, senior, and graduate 
students in personnel administration courses at the 
School of Business Administration provided 28 
groups. Since most of the graduates of this school 
follow careers in one of the major corporations, 
these subjects may be presumed to have made some 
vocational commitment to working in large scale 
organizations. In addition most subjects had had con- 
siderable part-time, and quite often full-time, work 
experience, and may be considered to be familiar 


2 Allen R. Solem's contribution of these data is
gratefully appreciated.

3 We thank Clayton Hill, Lee Danielson, and
James Taylor for allowing us class time to conduct
this study.

with the realities of organizational life. The case was 
administered as an exercise in the "leadership" sec- 
tion of the course, about two-thirds of the way 
through the semester. 

Human Relations Interested. Fifty groups were 
obtained from students in two semesters of an
undergraduate psychology course entitled Psychology
of Human Relations. About half the students were
sophomores, juniors, and seniors in the literary college,
while the other half were enrolled in the various
other schools of the university. Although the course was
a first one in industrial psychology and a large pro-
portion of the students had some work experience,
they were much less committed than were the busi-
ness administration students to a business career.
The problem was solved, in these classes, approxi-
mately midway in the semester, in connection with
the topic of motivation.

Introductory Psychology. Thirty-two groups from
an undergraduate introductory psychology course
represent the population farthest removed from the
business scene. Subjects in these groups were, typi-
cally, freshmen and sophomores from the literary
college, having little or no previous work experience.
The case was role-played as an example of "social
psychology applied to industry."
RESULTS 

The problem solving results, as shown in
Table 1, reveal a consistent decrease in the
proportion of Integrative solutions as one
compares the groups with the least degree of
identification and experience with business to the
groups with the most. The chi square value
for this relationship is 21.90, significant at
the .01 level of confidence.

In addition to the general trend of the
results, the patterns of solutions are sug-
gestive. The results obtained from the students
in the two psychology courses are almost
indistinguishable, although slightly favoring
groups in the introductory course. Similarly,




the results from the presently employed and 
the business administration groups are very 
much alike. There is one important difference, 
however, between these two latter groups. A 
significantly higher proportion of Old solu- 
tions was obtained from the employed than 
from the business administration groups. 

A comparison of these two pairs of groups 
indicates an incisive break. Group-by-group 
comparisons show that the employed and the 
business administration groups as compared 
with the human relations and the introductory 
psychology groups produced significantly more 
New solutions and significantly fewer Inte-
grative solutions.4


DISCUSSION 


The results are straightforward, but many 
interpretations are possible. The possibility 
that subjects in the employed groups are less 
well educated than subjects in the student 
groups can probably be ruled out. A large 
number of the subjects in the employed 
groups probably had some college training. 
In any case, Maier (1953) has shown that 
even foremen of assembly-line operations are 
capable of producing Integrative solutions
when they are trained in the techniques of
group discussion. Integrative solutions are 
easily achieved once the orientation is towards 
problem solving rather than making choices. 

Past experience with the Change of Work
Procedure problem suggests the importance of
the subjects' orientation towards group situa-
tions in terms of problem solving opportuni-
ties rather than authority relations. The prob-
lem solving orientation seems conducive to
arriving at Integrative solutions as both the
foreman and workers freely contribute their
ideas about the work situation. Concern with
authority relations usually produces New or
Old solutions, as the workers accept or reject
the foreman's authority to force them to
change their work method.

On the basis of this interpretation, the re-
sults suggest that the human relations and
the industrial psychology groups were more
often characterized by problem solving discus-
sions, while the employed and the business
administration groups evaluated the problem
in terms of accepting or rejecting the fore-
man's authority.

4 The only exception to this conclusion is the lack
of significant difference between the proportions of
New solutions produced by the employed and the
introductory psychology groups.

What accounts for this difference? Do the 
experiences of working in business and other 
organizational settings becloud all problems 
with the authority relations involved? Are the 
penalties for “bucking” authority in tradi- 
tional organizations so great that the unac- 
ceptability of a foreman's suggestion can only 
be expressed by elaborating the merits of the 
status quo, rather than thinking of other 
alternatives? Do organizationally experienced 
people assume, when playing the foreman's 
role, that their suggested solution should be 
accepted by the workers by virtue of the 
formal authority the foreman holds? The at- 
titude of the business administration students 
appears to have confirmed this view. They 
were even less likely than the presently em- 
ployed groups to resist the foreman’s sugges- 
tion and, in fact, usually went along willingly 
with the new suggestion. 

The results of this study provide suggestive 
empirical support for the proposition that 
the usual formal authority structure found in 
present day organizations tends to inhibit the 
expression of the creative potential of their 
members. Groups with little or no identifica- 
tion and experience in business produced more 
than three times as large a proportion of In- 
tegrative solutions as did the groups of pres- 
ently employed people, who have more fa- 
miliarity with the background of the problem. 
If this effect were to hold true for other prob- 
lems, organizations are failing to use the 
creative capability that they possess in their 
ranks. 

The high proportion of acceptance of the 
New method by the groups of business ad- 
ministration students raises another question 
which may cause business some difficulty in 
achieving creative problem solving among its 
employees. The question is this: Are people 
being attracted to industry who are able to 
work comfortably in the formal authority 
system and are willing to accept decisions 
from their bosses because “that is the right 
thing to do”? If the answer to this question 
is “yes” and the current practice of hiring 
business school graduates continues at its pres- 




ent rate, business may be hiring too many 
“yes men." The net result may be a still fur- 
ther decrease in the expression of creativity 
in industrial organizations. 

Awareness of the inhibiting effects on crea- 
tive problem solving of present-day adminis- 
trative practice is a first step toward more 
effective management. Adoption of the man- 
agement philosophy espoused by McGregor 
(1960) and Worthy (1959) and development 
of the attitudes and skills necessary to the im- 
plementation of this philosophy (Maier, 1952, 
1958) should follow. 


SUMMARY 


Groups from four populations differing in 
their amount of experience and identification 
with industrial vocation, were compared in 
their performances on the Change of Work 
Procedure problem. Arranged from most to 
least identified, there were 69 groups of people 
presently employed in large organizations, 28 
groups of business administration students, 50 
groups of students in a human relations
course, and 32 groups of students from an in-
troductory psychology course.

The percentages of Integrative (creative)
solutions to the problem were 11.6% by the
employed groups, 21.4% by the business ad-
ministration groups, 42.0% by the human
relations groups, and 46.9% by the introduc-
tory psychology groups. These differences are
statistically significant at the .01 level of
confidence.
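
As a rough check on these figures, the counts of Integrative solutions can be recovered from the percentages and group sizes just given and tested with an ordinary chi-square. The sketch below is an illustration under stated assumptions, not the authors' analysis: it collapses the solutions to Integrative versus all others, so it is not expected to reproduce the reported value of 21.90, which presumably rests on the full table of solution types. The variable names and the rounding of percentages to whole groups are assumptions.

```python
# Group sizes and percentages of Integrative solutions, as given in the Summary.
groups = {
    "employed": (69, 11.6),
    "business administration": (28, 21.4),
    "human relations": (50, 42.0),
    "introductory psychology": (32, 46.9),
}

# Recover whole-number counts from the rounded percentages (an assumption).
counts = {name: round(n * pct / 100) for name, (n, pct) in groups.items()}

total_groups = sum(n for n, _ in groups.values())
total_integrative = sum(counts.values())
rate = total_integrative / total_groups  # pooled proportion of Integrative solutions

# Pearson chi-square for the collapsed 4 x 2 table (Integrative vs. all other).
chi_square = 0.0
for name, (n, _) in groups.items():
    observed = (counts[name], n - counts[name])
    expected = (n * rate, n * (1 - rate))
    chi_square += sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CRITICAL_01_DF3 = 11.345  # tabled chi-square value for 3 df at the .01 level
print(counts)
print(round(chi_square, 2), chi_square > CRITICAL_01_DF3)
```

Even on this collapsed table the differences among the four populations clear the .01 criterion comfortably, which agrees in direction with the conclusion stated in the text.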


The results are interpreted as providing 




support for the proposition that the formal 
authority relations in organizations inhibit 
creative problem solving. They also suggest 
that business may be attracting people who 
can work comfortably, but not creatively, in 
such formal authority systems. 


REFERENCES 


HOFFMAN, L. R. Homogeneity of member personality
and its effects on group problem solving. J. abnorm.
soc. Psychol., 1959, 58, 27-32.

HOFFMAN, L. R., & MAIER, N. R. F. Quality and ac-
ceptance of problem solutions by members of ho-
mogeneous and heterogeneous groups. J. abnorm.
soc. Psychol., 1961, 62, 401-407.

McGREGOR, D. The human side of enterprise. New
York: McGraw-Hill, 1960.

MAIER, N. R. F. Principles of human relations. New
York: Wiley, 1952.

MAIER, N. R. F. An experimental test of the effect
of training on discussion leadership. Hum. Relat.,
1953, 6, 161-173.

MAIER, N. R. F. Psychology in industry. (2nd ed.)
Boston: Houghton Mifflin, 1955.

MAIER, N. R. F. The appraisal interview: Objectives,
methods and skills. New York: Wiley, 1958.

MAIER, N. R. F., & HOFFMAN, L. R. Quality of first
and second solutions in group problem solving.
J. appl. Psychol., 1960, 44, 278-283.

MARCH, J. G., & SIMON, H. A. Organizations. New
York: Wiley, 1958.

MAYO, E. Human problems of an industrial civiliza-
tion. New York: Macmillan, 1933.

ROETHLISBERGER, F. J., & DICKSON, W. J. Manage-
ment and the worker. Cambridge: Harvard Univer.
Press, 1939.

WORTHY, J. C. Big business and free men. New York:
Harper, 1959.


(Received October 10, 1960) 


Journal of Applied Psychology 


Vol. 45, No. 5


OCTOBER 1961 


PERSONAL HISTORY CORRELATES OF PHYSICAL 
SCIENTISTS' CAREER ASPIRATIONS 


LEWIS E. ALBRIGHT AND J. R. GLENNON 1


American Oil Company


In discussing the problems of managing a 
research and development laboratory, several 
writers (Bello, 1956; Miles & Vail, 1960; 
Torpey, 1960) have mentioned the trend to- 
ward providing two ladders of advancement 
for the physical scientist employed in large 
industrial laboratories. One of these lines of 
progression is the supervisory or adminis- 
trative route within the laboratory. The other 
consists of “bench” research positions of in- 
creasing complexity or responsibility, but re- 
quiring little or no administrative supervision 
of others. This latter hierarchy is intended to 
provide comparable “rewards” to productive 
researchers without diverting their efforts into 
administration. The rewards are, of course, 
larger salaries, more rank, impressive titles, 
and privileges similar to supervisory and man- 
agerial “extras.” 

Implicit in this modified organizational
structure is the suggestion of basic differences,
at least in career values and aspirations, be-
tween research scientists and administrative
personnel. Common educational backgrounds,
identical treatment during recruitment, and
initial laboratory employment would tend to
mask such differences, however, and these
could be critical in subsequent performance.
If differences do exist, then selection and
placement techniques should be capable of
development to assist in matching scientists
and laboratory career choice.

One technique which has shown promise in
a similar area, engineering job placement,
is the Strong Vocational Interest Blank. Spe-
cifically, Dunnette (1957) found that scoring
keys derived from the Strong discriminated
among engineers working in different functions
(pure research, sales, etc.). More recently,
Kulberg and Owens (1960) showed that a
number of life history items were differentially
correlated with Dunnette's keys, which is not
surprising since many "life history" items
have to do with interests. But this finding
does suggest that a personal history question-
naire itself might prove to be a valid dis-
criminator of supervisory- vs. research-ori-
ented scientists. The present study is an
attempt to investigate that possibility.

1 The writers wish to acknowledge the assistance
of Wallace J. Smith of the Standard Oil Company
(Indiana) and William A. Owens of Purdue Uni-
versity in the conduct of this study.


METHOD 


A detailed description of the personal history ques- 
tionnaire, subjects, and procedures of data collec- 
tion and analysis used in this study may be found 
in an earlier report (Smith, Albright, Glennon, & 
Owens, 1961). Briefly, a 484-item instrument cover- 
ing various background topics was administered to 
employed petroleum research scientists; item re- 
sponses were analyzed and cross-validated against 
three criteria of job success (supervisory ratings 
made separately on both over-all job performance 
and creativity, as well as the objective criterion of 
number of patent disclosures filed during a 5-year 
period). 

For the purpose of the present study, the subjects 
were grouped according to their responses to the 
first two alternatives of Question 179 of the per- 
sonal history questionnaire, which follows: 


179. Assuming a free choice which line of progres- 
sion would you prefer? 


1. Research assignments, such as, research as- 
sistant, research associate, etc. 

2. Supervisory or administrative assignments, 
such as, group leader, section leader, etc. 
3. Advancement outside the Research Depart- 

ment 
4. Any of the above 




Initially, only those individuals were considered 
who had not yet achieved positions high enough to 
be in either of the lines of advancement described in 
Responses 1 and 2 above. Therefore, these subjects 
were termed research and administrative aspirants, 
respectively. There were 79 of the former and 62 of 
the latter. The groups were equated for length of 
company service by eliminating 11 of the longer 
service research aspirants, resulting in the total 
complement of both groups having hiring dates after 
World War II. The groups, as then constituted, were 
compared with respect to their amount of education, 
ratings on job performance and creativity, and num- 
ber of patent disclosures. 

Following these comparisons, each group was sub- 
divided into random halves and item analyses of the 
remaining 483 personal history items performed on 
each combination of research and administrative sub- 
samples separately. Items were retained which dis- 
criminated the administrative from the research as- 
pirants in both item analyses at the .10 level or 
beyond; on a compound probability basis this is 
equivalent to approximately the .05 level or less. 
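
The text does not say which rule was used to compound the two probabilities. One common reading, offered here only as an assumption, is Fisher's method for combining independent tests: with two p values the statistic -2 ln(p1 p2) follows a chi-square distribution with 4 df, whose upper-tail probability has the closed form p1 p2 (1 - ln(p1 p2)). The function name below is illustrative.

```python
import math

def combined_probability(p1: float, p2: float) -> float:
    """Fisher's combined probability for two independent tests.

    Uses the closed-form upper tail of chi-square with 4 df:
    P = k * (1 - ln k), where k = p1 * p2.
    """
    k = p1 * p2
    return k * (1.0 - math.log(k))

# An item that just reaches the .10 level in each of the two item analyses.
combined = combined_probability(0.10, 0.10)
print(round(combined, 3))
```

Under this reading an item that barely meets the .10 criterion in both halves has a combined probability near .056, and any item doing better in either half falls below that, which is in line with the "approximately the .05 level or less" stated above.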


RESULTS 


Comparability of aspirant groups on educa-
tion and criterion variables. Inspection of data
on the educational attainment of the research
and administrative aspirants revealed no es-
sential difference between the groups on this
variable. Inasmuch as the subjects were volun-
teers from the finite population of the labora-
tory, the t test proposed by Johnson (1949,
p. 71) was used to test the significance of the
difference between means of the groups' per-
formance and creativity ratings, as given in
Table 1. The results show that the administra-
tive aspirants were rated significantly higher
than the research aspirants on both over-all
performance and creativity.2 A median test
(Walker & Lev, 1953) revealed that the me-
dian difference of six patent disclosures in
favor of the research aspirants, also shown in
Table 1, was significant. (The variation in Ns
appearing in Table 1 is due to missing cri-
terion data.)

Personal history analysis. The item analyses
of the personal history questionnaire against
the research-administrative dichotomy estab-
lished from Item 179 revealed a total of 43
items which discriminated the aspirant groups
at the required significance level. For illus-
trative purposes, three of these are given be-
low, the "A" and "R" designating the group,
administrative or research, making the indi-
cated response more frequently than the other
group.


133. How much time, on the average, do you spend
reading technical or professional journals,
magazines or books?

      1. None
      2. Less than 1 hour a week
   A. 3. 1 to 3 hours a week
      4. 4 to 7 hours a week
      5. More than 7 hours a week

150. Would your choice of an ideal job be one
which:

   A. 1. Allowed a great amount of interaction with
         other people
      2. Would require working with a small group
   R. 3. Would allow you to work closely with one
         other person
   R. 4. Would allow you to work by yourself

436. Assuming you had sufficient musical ability
and training to perform in the following ca-
pacities, which one do you believe would give
you the greatest personal satisfaction:

      1. Soloist, instrumental or vocal
      2. Composer
      3. Conductor
      4. Arranger
      5. Member of orchestra or choral group,
         not soloist
      6. Critic

Although many of the significant items are
more concerned with factual biographical in-
formation than are these three, there is a sur-
prising confirmation of the stereotype of the
research scientist as a somewhat introverted,
work-oriented person vs. the more outgoing,
status-seeking administrator.

2 Because the performance ratings were based on
a seven-step forced distribution system with 1 as the
highest rating and 7 the lowest, the larger mean of
4.47 for research aspirants is in fact a poorer rating.


TABLE 1

COMPARISONS OF RESEARCH AND ADMINISTRATIVE ASPIRANTS ON PERFORMANCE AND CREATIVITY
RATINGS AND PATENT DISCLOSURE RECORDS

                               Research Aspirants        Administrative Aspirants    Significance
Variable                        N      M        SD        N      M        SD         t(b)         df    p
Over-all Performance Rating     68     4.47     1.46      62     4.03     1.28       [illegible]  128   [illegible]
Creativity Rating               58    -2.91    85.96      55    24.85    78.37       [illegible]  111   [illegible]
No. of Patent Disclosures       42     7.00(a) 15.11      35     1.00(a) 10.02       [illegible]

(a) These values are medians, others are means.
(b) t tests for significance of difference between means from a small finite population (Johnson, 1949, p. 71).




VALIDITY GENERALIZATIONS 


It is necessary, for prediction purposes, to 
verify these findings on independent samples 
because of the "internal" method of item 
validation used. Consequently, the records of 
65 individuals not involved in the previous 
analyses were selected for study. Fifty-three 
of these subjects were either group leaders or 
section leaders, the remaining 12 being re-
search associates or their assistants. Refer-
ence to Question 179 shows that their job
titles placed these subjects in one of the two 
lines of advancement mentioned in Responses 
1 and 2 and hence, are called administrative 
and research incumbents, respectively, in the 
remainder of this paper. Any incumbent not 
indicating a preference, on Question 179, for 


the line of advancement in which his job title
placed him was held out for separate analysis.

Examination of the three criterion variables
for the incumbent groups showed that the
trends noted earlier for the aspirants remained
the same, i.e., administrators were rated higher
on performance and creativity but had fewer
patent disclosures than the researchers. (Sig-
nificance tests were not employed in this case
because of limited data on one or more cri-
teria.)

The validity of the 43 previously significant
items was investigated for the incumbents
by use of a simple scoring scheme.
Each incumbent, regardless of actual job title,
was arbitrarily given a score of +1 for each
response characteristic of the research
aspirants and a -1 for an administrative type




TABLE 2

SCORES OF RESEARCH AND ADMINISTRATIVE INCUMBENTS
ON 43-ITEM PERSONAL HISTORY KEY

                 Research       Administrative
                 Incumbents     Incumbents
Total Score      (N = 12)       (N = 53)
15-16            x              [tallies illegible]
13-14
11-12            x
9-10             xx
7-8              xxx
5-6              x
3-4              x
1-2              x
-1-0             xx
-3-(-)2
-5-(-)4
-7-(-)6
-9-(-)8
-11-(-)10
-13-(-)12
M                6.42           -1.72
SD               4.70


response. Table 2 shows the distribution of
total scores on this scoring key. The means of
6.42 for researchers and -1.72 for adminis-
trators are in the expected direction and differ
significantly (t = 5.32 with 63 df, p < .01).
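
The scoring scheme just described can be sketched in a few lines. The key below contains only the keyed options that are visible in the illustrative items quoted earlier (Items 133 and 150); the full 43-item key is not reproduced in the article, and the two respondents are hypothetical.

```python
# (item number, option number) pairs answered more often by each aspirant group,
# taken from the illustrative items quoted in the Results section.
RESEARCH_KEY = {(150, 3), (150, 4)}
ADMINISTRATIVE_KEY = {(133, 3), (150, 1)}

def key_score(responses: dict) -> int:
    """Score +1 per research-type response and -1 per administrative-type response."""
    score = 0
    for item, option in responses.items():
        if (item, option) in RESEARCH_KEY:
            score += 1
        elif (item, option) in ADMINISTRATIVE_KEY:
            score -= 1
    return score

# Hypothetical answer sheets (item number -> option chosen).
bench_oriented = {133: 5, 150: 4}
management_oriented = {133: 3, 150: 1}

print(key_score(bench_oriented), key_score(management_oriented))
```

Responses that appear in neither key contribute nothing, so a total score is simply the number of research-type answers minus the number of administrative-type answers.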

An additional bit of evidence of the validity 
of the key was obtained by applying the scor- 
ing system to the papers of six administrative 
incumbents whose responses to Question 179 
indicated they preferred the research line of 
advancement. Their mean score of 3.33, 
roughly midway between those of the other 
two groups, makes sense intuitively since these 
individuals see themselves as misplaced on 
their present jobs. 

A second validity generalization study was
carried out with individuals one or more levels
higher in the organization than those utilized
heretofore. Specifically, the personal history
questionnaires of five senior research associates
and 12 division directors were scored.
These jobs are virtually the highest ones in
the research and administrative hierarchies, re-
spectively. (Obviously, the pyramidal shape of
the organization seriously limits the number
of such positions.) The responses of these
groups were available on only 18 of the 43
discriminating items, but the means of .80 for
the researchers and -1.50 for the adminis-
trators follow the pattern previously estab-
lished. The small numbers of subjects pre-
clude the meaningful application of any
statistical test of significance; it would seem,
however, that in confirming the trend, this
finding is more than merely suggestive.


DISCUSSION


The process of applying a scoring key de- 
veloped at one level of the organization to 
successively higher echelons for validity gen- 
eralization purposes is somewhat novel in the 
writers’ experience. It seems reasonable to 
believe, however, that differences in personal 
history which can be discriminated at the 
entry level should persist as people move up, 
although such differences will be increasingly 
difficult to demonstrate due to smaller Ns, 
greater remoteness of early experience for the 
older person, etc. How fruitful the method 
would prove in closely allied measurement 
areas such as interest and personality test de- 
velopment is questionable. With relatively 
homogeneous groups it would seem to offer 
real promise. But among managers in general, 
for example, the work of Porter and Ghiselli 
(1957), using a self-description adjective 
checklist, suggests that differences between 
management levels might mask within-level 
differences. 

Besides their importance in job placement, 
significant personal history differences be- 
tween industrial researchers and their ad- 
ministrator colleagues could have implications 
for other features of the laboratory climate. 
For example, it is the administrators who, by 
nature of their jobs, determine the perform- 
ance ratings, salary adjustments, promotional 
prospects, and the like. They or their sub- 
ordinates do much of the college recruiting. 
In short, biases toward favoring one's own
kind, consciously or unconsciously practiced,
could do a great deal of harm to the organi-
zational careers of research scientists and, of
course, serious disservice to the research
achievements of the laboratory.


SUMMARY 


A total of 43 personal history items were
found to differentiate employed petroleum re-
search scientists desiring to advance in the
laboratory supervisory hierarchy from a simi-
lar group aspiring to increased salary and
status while remaining "at the bench." For
purposes of validity generalization a scoring
key composed of the discriminating items was
applied successively to groups of scientists
working at the middle and upper levels in the
administrative and technical lines of advance-
ment. The key retained its discriminating
power (though to a lesser extent) at each
higher echelon of the laboratory organization.
Some implications of this result for personnel
placement and management of an industrial
research and development function were dis-
cussed.

REFERENCES

BELLO, F. Industrial research: Geniuses now welcome.
Fortune, 1956, 53(1), 96-99, 142-150.

DUNNETTE, M. D. Vocational interest differences
among engineers employed in different functions.
J. appl. Psychol., 1957, 41, 273-278.

JOHNSON, P. O. Statistical methods in research. New
York: Prentice-Hall, 1949.

KULBERG, G. E., & OWENS, W. A. Some life history
antecedents of engineering interests. J. educ. Psy-
chol., 1960, 51, 26-31.

MILES, S. B., JR., & VAIL, T. E. Thinking ahead:
Dual management. Harvard bus. Rev., 1960,
27, 30, 149-150, 152-154.

PORTER, L. W., & GHISELLI, E. E. The self percep-
tions of top and middle management personnel.
Personnel Psychol., 1957, 10, 397-406.

SMITH, W. J., ALBRIGHT, L. E., GLENNON, J. R., &
OWENS, W. A. The prediction of research com-
petence and creativity from personal history. J.
appl. Psychol., 1961, 45, 59-62.

TORPEY, W. G. Conserving our technological man-
power. Personnel, 1960, 37(2), 61-67.

WALKER, HELEN M., & LEV, J. Statistical inference.
New York: Holt, 1953.


(Received December 14, 1960)


Journal of Applied Psychology 
1961, Vol. 45, No. 5, 285-288 


SIMPLE FORMULA AIDS FOR UNDERSTANDING THE JOINT ACTION
OF TWO PREDICTORS 1


MELVIN R. MARKS
Psychological Research Associates, Arlington, Virginia
RAYMOND E. CHRISTAL AND ROBERT A. BOTTENBERG


Personnel Laboratory, Wright Air Development Division 


Suppose you had the problem of predicting 
academic success in a college course. You try 
two predictors in a large sample and find that 
one has a validity of .10 and the other has a 
validity of .00. Furthermore, you determine 
that these two predictors correlate .99 with 
each other. What would you conclude? 

If your conclusion is that you should throw 
away the two predictors and start over again, 
then we suggest you read this article in its 
entirety. If, on the other hand, you are not 
surprised to learn that these two predictors 
will produce a multiple correlation of .707, 
then you might wish to skip to the formula 
derivations in the last section of this article. 
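
The opening puzzle can be checked directly. The sketch below assumes the standard three-variable formula for the multiple correlation, R^2 = (V1^2 + V2^2 - 2 V1 V2 r) / (1 - r^2); the function name is illustrative and the formula is supplied here rather than quoted from the article.

```python
import math

def multiple_r(v1: float, v2: float, r: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    v1 and v2 are the two validity coefficients and r is the correlation
    between the predictors (standard formula; requires |r| < 1).
    """
    r_squared = (v1**2 + v2**2 - 2.0 * v1 * v2 * r) / (1.0 - r**2)
    return math.sqrt(r_squared)

# Validities of .10 and .00 with the predictors correlating .99.
opening_example = multiple_r(0.10, 0.00, 0.99)
print(round(opening_example, 3))  # 0.709
```

With the inputs taken exactly as rounded in the text, the result is about .71, in agreement with the figure quoted above to two places.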

It all started when, in actual research, a
multiple R of .544 for two predictors was com-
puted. Now there is nothing unusual about a
multiple correlation validity coefficient of .544,
except in this case the validity coefficients for
the two predictors were .060 and .168, re-
spectively. At first, the results were ques-
tioned, especially when it was noted that the
correlation between the two predictors was
.979. The investigator was familiar with so-
called "suppressor effects,"2 but he had not had
occasion to see them in action to this extent.
He found himself checking the results on a
calculator. A mistake had not been made.

1 The research reported in this paper was sponsored
by Personnel Laboratory, Wright Air Develop-
ment Division, under ARDC, Project 7734.

2 The authors have some doubts about the term and
usual explanation of "suppressor effects." The fac-
torial explanation breaks down rapidly as the number of
predictors increases, and is not completely
satisfactory even in the problem of two predictors. For
example, in the "actual research" men-
tioned in this paper, $V_1$ is a composite and $V_2$
is a part of that same composite. It is difficult to
explain what is being suppressed out of what in this
instance.

When the finding was discussed with asso-
ciates, their reactions were interesting. The
statisticians took the attitude of "I told you
so," for they had been warning researchers
against forecasting multiple correlations from 
inspection of the predictor matrix. The atti- 
tude of most investigators, on the other hand, 
was one of “Well, I'll be damned." They 
realized they would not have predicted these 
results from inspection of the data. In fact, 
if they had come across two experimental 
variables with such a high intercorrelation, 
they would have discarded one without hesita- 
tion—especially if neither predictor had much 
relationship with the criterion. 

We thought the reader might enjoy, and
perhaps profit from, some experiences the
authors had while studying the effect of the
intercorrelation term on the two-predictor,
multiple correlation coefficient. Before con-
tinuing, let us adopt the following notation:


$V_1$ = validity coefficient of Predictor 1
$V_2$ = validity coefficient of Predictor 2
r = correlation between Predictor 1 and
Predictor 2
R = multiple correlation


The first and primary point is that for any 
fixed pair of validity coefficients, there must 
be a value for the intercorrelation term which 
will produce a minimum, multiple correlation 
coefficient. Applying a little calculus to the 
basic two-predictor formula, the authors were 
pleasantly surprised to find that this value 
turns out to be simply the ratio of the two 
validity coefficients. That is:


$r = V_1/V_2$ or $r = V_2/V_1$

= value of the intercorrelation term for a
fixed pair of validity coefficients in the
two-predictor equation which results
in a minimum R. Select the ratio which
results in a value of 1.00 or less.




Stated in other words, for the three-variable 
case, when the correlation between predictors 
is identical to the ratio of the predictor validity 
coefficients (smallest coefficient in the numera- 
tor), the multiple R is a minimum. Any value 
of the intercorrelation different from this ratio 
will yield a larger multiple R. 

What is the value of this minimum R? With 
a little thought it becomes obvious that the 
minimum R always will be exactly equal to the 
larger of the two validity coefficients. In other 
words, the ratio of the two validity coefficients 
will locate for us an intercorrelation value 
such that the variable having the lowest valid- 
ity will add nothing to the variable having the 
highest validity in predicting the criterion.
Thus, when the intercorrelation is equal to the
ratio of the validity coefficients, there is no 
increase in predictive efficiency by employing 
the variable with the lowest validity. 
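
The calculus behind these two statements can be sketched as follows, assuming the standard three-variable formula for the squared multiple correlation. The article defers its own derivations to a later section, so the steps here are a reconstruction and not the authors' wording.

```latex
R^2 = \frac{V_1^2 + V_2^2 - 2 V_1 V_2 r}{1 - r^2}, \qquad |r| < 1

\frac{d(R^2)}{dr}
  = \frac{2\left[-V_1 V_2 r^2 + (V_1^2 + V_2^2)\, r - V_1 V_2\right]}{(1 - r^2)^2}
  = \frac{-2\,(V_1 r - V_2)(V_2 r - V_1)}{(1 - r^2)^2}

\frac{d(R^2)}{dr} = 0 \;\Longrightarrow\; r = \frac{V_2}{V_1} \quad \text{or} \quad r = \frac{V_1}{V_2}

\text{For } r = \frac{V_2}{V_1} \text{ with } |V_2| < |V_1|: \qquad
R^2 = \frac{V_1^2 - V_2^2}{1 - V_2^2 / V_1^2} = V_1^2
```

Only the root that is 1.00 or less in absolute value is an admissible correlation. When the two validities differ in size, R^2 grows without bound as r approaches either +1 or -1, so the single admissible stationary point is a minimum, and at that point R equals the larger validity in absolute value.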

Since the ratio $V_1/V_2$ seems to be all-im-
portant for the effect of the intercorrelation
on the magnitude of multiple R, it will be of
interest to examine the results for the cases
where $V_1 = V_2$, where $V_1 \neq V_2$ (neither $V_1$
nor $V_2$ is zero), and where $V_1$ or $V_2$ (but not
both) is equal to zero. Note that the first and
third cases are special, in that they rarely
occur.

In the first case, where $V_1 = V_2$, the ratio
$V_1/V_2$ is exactly 1.00. If two validity coeffi-
cients are equal, then, the smallest multiple
correlation will be obtained when the two
predictors are perfectly related. Any other
value for the intercorrelation term will result
in a higher multiple R.

"So what?" one might ask. Most of us have 
been taught that, in general, the lower the 
relationship between two predictors, the higher 
the resulting multiple. Perhaps the best answer 
is provided by a look at Case 3, where $V_1$ or $V_2$
(but not both) is equal to zero. In this instance
the ratio, with the zero coefficient in the numerator,
is .00, and the lowest multiple R will
be computed when the two predictors corre-
late zero. Thus, the mirror image of Case 1
exists, and it can be shown (see Table 1) that 
the higher the correlation of the predictors, the 
higher will be the resulting multiple R. 

Both Case 1 and Case 3 are special cases. 
The more usual situation is Case 2, where
$V_1 \neq V_2$ and neither is zero. For example, let
us assume that $V_1$ is .50 and $V_2$ is .10. In this
instance, $V_2/V_1$ is .20. If the correlation of
Predictor 1 with Predictor 2 were .20, then
Predictor 2 would add nothing to Predictor 1,
and the resulting multiple R would be exactly
.50. If the correlation of Predictor 1 with
Predictor 2 were different from .20, in either
direction, the resulting R would be greater
than .50.
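
This worked example is easy to confirm numerically. The sketch below again assumes the standard two-predictor formula and simply scans a grid of intercorrelations for validities of .50 and .10; the grid limits and names are illustrative choices.

```python
import math

def multiple_r(v1, v2, r):
    """Two-predictor multiple correlation (standard formula, |r| < 1)."""
    return math.sqrt((v1**2 + v2**2 - 2 * v1 * v2 * r) / (1 - r**2))

v1, v2 = 0.50, 0.10

# Intercorrelations from -.80 to .90 in steps of .01.
grid = [i / 100 for i in range(-80, 91)]
r_at_minimum = min(grid, key=lambda r: multiple_r(v1, v2, r))

print(r_at_minimum, round(multiple_r(v1, v2, r_at_minimum), 3))
print(round(multiple_r(v1, v2, 0.10), 4), round(multiple_r(v1, v2, 0.30), 4))
```

The smallest R on the grid occurs at an intercorrelation of .20, where R equals .50, and the values on either side of .20 are slightly larger.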

This simple formula could be especially
helpful to investigators who lack ready access
to computing facilities. Suppose, for example,
one had a matrix containing the intercorrelations
and validities for an operational selection
composite and certain experimental variables.
He could simply take the ratio of the validity of
each of the experimental variables, in turn, to
the validity coefficient of the selection composite.
By comparing these ratios to the observed
intercorrelations he could determine if any one of
the experimental variables could add predictive
variance to the selection composite. The greater
the discrepancy between the ratio and the
intercorrelation, the greater would be the
expected contribution of the experimental
variable to the composite. It should also be
noted that situations may exist where one
might make greater progress toward improving
prediction by increasing the correlation between
predictors (validities being held constant)
rather than decreasing such correlations.
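
The screening procedure just described can be sketched as follows. This is only an illustration: the function name, the variable names, and all numerical values are hypothetical, and the sketch assumes the selection composite has the higher validity, so that each ratio is at most 1.00.

```python
def screening_table(composite_validity, candidates):
    """Rank experimental variables by |validity ratio - intercorrelation|.

    candidates maps a variable name to a pair:
    (validity of the variable, its correlation with the composite).
    """
    rows = []
    for name, (validity, intercorrelation) in candidates.items():
        ratio = validity / composite_validity
        discrepancy = abs(ratio - intercorrelation)
        rows.append((name, ratio, intercorrelation, discrepancy))
    # Largest discrepancy first: the most promising additions lead the list.
    return sorted(rows, key=lambda row: row[3], reverse=True)

# Hypothetical composite validity and experimental variables.
candidates = {
    "A": (0.10, 0.20),   # ratio equals intercorrelation: adds nothing
    "B": (0.10, 0.60),
    "C": (0.30, 0.10),
}
for name, ratio, intercorrelation, discrepancy in screening_table(0.50, candidates):
    print(f"{name}: ratio = {ratio:.2f}, r = {intercorrelation:.2f}, "
          f"discrepancy = {discrepancy:.2f}")
```

A variable whose discrepancy is zero sits exactly at the minimizing intercorrelation and would leave the multiple R equal to the composite validity.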

A formula to determine the values for the
correlation between two predictors which will
lead to a maximum R turns out to be very
simple:

\[ r = V_1 V_2 \pm \sqrt{(1 - V_1^2)(1 - V_2^2)} \]

which gives the two values for the correlation
between two predictors (with fixed validity
coefficients) which will grant a multiple
R of 1.00.
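
As a minimal sketch of how this formula might be evaluated, the function below returns both roots for a given pair of validities. The function name is an illustrative assumption; the validities .50 and .10 are the pair used in the running example.

```python
import math

def r_for_perfect_multiple(v1, v2):
    """Return the two predictor intercorrelations that give R = 1.00."""
    center = v1 * v2
    half_width = math.sqrt((1 - v1 ** 2) * (1 - v2 ** 2))
    return center - half_width, center + half_width

low, high = r_for_perfect_multiple(0.50, 0.10)
print(round(low, 2), round(high, 2))   # -0.81 0.91
```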

From applying this formula, we see that, when
two predictors have validity coefficients of .50
and .10, respectively, they will produce a
multiple correlation of 1.00 if their correlation
with each other is either .91 or -.81.

A little more understanding of the influence
of the correlation between two predictors on
the resulting R can be obtained by inspecting
Table 1. This table shows the fallacy of the
widely accepted opinion that in order to obtain a
high multiple correlation we should have a




TABLE 1 


EFFECT OF THE CORRELATION (r) BETWEEN TWO PREDICTORS ON THE MAGNITUDE OF THE MULTIPLE
CORRELATION COEFFICIENT FOR FIXED VALIDITIES V1 AND V2


Fixed Pair of Validity Coefficients


.00   .10   .50   .00   .24   .36   .21
.50   .50   .50   .80   .80   .60   .30
—.86 98 96 
— 84 92 90 
82 87 85 
EU 83 97 81 
—.10 70 80 .66 
—.60 .62 A 1.00 .57 
—.50 58 64 1.00 92 97 51 
— 40 55 .60 91 87 89 AT 
— 30 52 36 85 m 95 82 43 
—.20 E E 79 82 90 IT al 
= f .50 E! 45 .80 .86 73 38 
‘00 50 E n .80 84 70 37 
10 50 50 67 80 82 67 35 
20 51 50 64 82 80 64 34 
30 ED ED 62 E 80 .63 33 
40 55 St .60 87 80 61 32 
i 5 53 58 92 82 .60 En 
50 58 E 
‘60 62 56 56 1.00 85 .60 30 
x 9 
70 70 61 ED 92 61 30 
: 63 30 
83 rji 53 
e 87 14 52 64 31 
82 b 52 65 391 
7 
84 22 A 67 31 
86 98 .82 52 E E 
È z 87 52 69 32 
88 5 51 73 33 
90 51 4 34 
92 51 85 37 
94 P 98 41 
25 .50 52 
98 eo E 
99 $ 
r for minimum R     .000   .200  1.000   .000   .300   .600   .700
r1 for R = 1.00    -.866  -.812  -.500  -.600  -.390  -.530  -.870
r2 for R = 1.00     .866   .912          .600   .774   .962   .996


high correlation between each of the independent
variables and the dependent variable, and a low
correlation between the predictor variables.
That this viewpoint is not necessarily correct
was demonstrated by Horst (undated). Although
Table 1 illustrates the manner in which R changes
as predictor intercorrelation varies for a few
selected pairs of validities, and the formulas
presented above provide a ready answer to the
question of the values of r which minimize and
maximize R in a two-predictor case, the reader
should recognize that the only sure way of
determining the extent to which a predictor
will increase the predictive efficiency of a set of
predictors is to compute the multiple correlation
both with and without the predictor in
question.
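
For the two-predictor case, the "with and without" comparison recommended above can be sketched directly. The function name and the consistency check are illustrative assumptions; the check simply rejects combinations of validities and intercorrelation that would imply R greater than 1.00 and so could not occur in real data.

```python
import math

def gain_from_second_predictor(v1, v2, r):
    """Increase in multiple R when Predictor 2 is added to Predictor 1.

    Without Predictor 2 the multiple R is just |v1|; with it, R follows
    the two-predictor formula. Raises ValueError when the three
    correlations are mutually inconsistent (R squared would exceed 1).
    """
    r_squared = (v1 ** 2 + v2 ** 2 - 2 * v1 * v2 * r) / (1 - r ** 2)
    if r_squared > 1 + 1e-12:
        raise ValueError("validities and intercorrelation are inconsistent")
    return math.sqrt(r_squared) - abs(v1)

print(f"{gain_from_second_predictor(0.50, 0.10, 0.20):.3f}")   # no gain at r = V2/V1
print(f"{gain_from_second_predictor(0.50, 0.10, -0.50):.3f}")  # gain when r departs from .20
```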




FORMULA DERIVATIONS 


1. To solve for the value of r which minimizes
R for a fixed pair, V1 and V2, differentiate
the following expression with respect to
r; then set the derivative equal to zero and
solve the resulting equation for r.

\[ R^2 = \frac{V_1^2 + V_2^2 - 2V_1V_2r}{1 - r^2} \]

\[ \frac{dR^2}{dr} = \frac{(1 - r^2)(-2V_1V_2) - (V_1^2 + V_2^2 - 2V_1V_2r)(-2r)}{(1 - r^2)^2}
= \frac{-2V_1V_2r^2 + 2(V_1^2 + V_2^2)r - 2V_1V_2}{(1 - r^2)^2} \]

Setting the derivative equal to zero and multiplying
through by -(1 - r^2)^2 we obtain:

\[ 2V_1V_2r^2 - 2(V_1^2 + V_2^2)r + 2V_1V_2 = 0 \]

Solving for the roots of the equation, we obtain:

\[ r = \frac{2(V_1^2 + V_2^2) \pm \sqrt{4(V_1^2 + V_2^2)^2 - 16V_1^2V_2^2}}{4V_1V_2}
= \frac{2(V_1^2 + V_2^2) \pm 2(V_1^2 - V_2^2)}{4V_1V_2} \]

\[ r = \frac{V_1}{V_2} \quad \text{or} \quad r = \frac{V_2}{V_1} \tag{1} \]
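
The result of this derivation can be checked without calculus by a simple grid search. The sketch below is an illustration under stated assumptions: the function name, the grid spacing of .001, and the grid limits are arbitrary choices, with the limits kept inside the range where R squared does not exceed 1.00 for validities of .50 and .10.

```python
def r_squared(v1, v2, r):
    """Squared multiple correlation for two predictors."""
    return (v1 ** 2 + v2 ** 2 - 2 * v1 * v2 * r) / (1 - r ** 2)

v1, v2 = 0.50, 0.10
grid = [step / 1000 for step in range(-800, 901)]   # r from -.800 to .900
best_r = min(grid, key=lambda r: r_squared(v1, v2, r))
print(best_r, v2 / v1)   # grid minimum and the ratio of lower to higher validity
```

Of the two roots in Formula 1, only the ratio of the lower to the higher validity can be a correlation, since the other exceeds 1.00 in absolute value whenever the validities differ.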


2. To obtain values of r which provide a
multiple R of 1.00 when two predictors with
fixed validity coefficients are least-squares
weighted, solve the following equation for r:

\[ \frac{V_1^2 + V_2^2 - 2V_1V_2r}{1 - r^2} = 1 \]

This may be expressed as:

\[ r^2 - 2V_1V_2r + V_1^2 + V_2^2 - 1 = 0 \]

therefore, solving for r:

\[ r = \frac{2V_1V_2 \pm \sqrt{4V_1^2V_2^2 - 4(V_1^2 + V_2^2 - 1)}}{2} \]

and simplifying:

\[ r = V_1V_2 \pm \sqrt{(1 - V_1^2)(1 - V_2^2)} \tag{2} \]


Restriction 1: When V1 = V2, only one root
of Formula 2 is applicable and it reduces to
the expression 2V^2 - 1.

Restriction 2: Formula 2 is not applicable
when V1 = V2 = 0. This is the only fixed pair
of validity coefficients for which no value of r
will grant a multiple R of 1.00.
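
Both restrictions can be illustrated numerically. In this sketch the function names and the choice of V = .50 are assumptions made for illustration. The point is that any root equal to +1.00 or -1.00 makes the denominator 1 - r^2 vanish, so the multiple R formula cannot be evaluated there.

```python
import math

def formula_2_roots(v1, v2):
    """Both roots of Formula 2 for a pair of validities."""
    center = v1 * v2
    half_width = math.sqrt((1 - v1 ** 2) * (1 - v2 ** 2))
    return center - half_width, center + half_width

def r_squared(v1, v2, r):
    """Squared multiple correlation; undefined when r is +1 or -1."""
    return (v1 ** 2 + v2 ** 2 - 2 * v1 * v2 * r) / (1 - r ** 2)

v = 0.50
low, high = formula_2_roots(v, v)
print(low, 2 * v ** 2 - 1)       # applicable root equals 2V^2 - 1
print(high)                      # 1.0: denominator 1 - r^2 is zero here
print(r_squared(v, v, low))      # 1.0
print(formula_2_roots(0.0, 0.0)) # (-1.0, 1.0): neither root is usable
```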


SUMMARY 


It was demonstrated that the ratio of any
two fixed validity coefficients will locate a
value for the intercorrelation term in the
two-predictor multiple correlation formula such
that the variable having the lower validity will
contribute nothing to the variable having the
higher validity. The resulting multiple correlation
is thereby a minimum.

A second formula is presented that determines
two intercorrelation values for a
fixed pair of validity coefficients which will
result in a multiple correlation of 1.00. The
second formula does not hold when both
validity coefficients are exactly zero, and is
restricted to determination of a single
intercorrelation value when the two validity
coefficients are exactly equal and not zero.


REFERENCE 


Horst, P. The role of prediction variables which are
independent of the criterion. In Social Science
Research Council, Committee on Social Adjustment,
Prediction of personal adjustment. New York: SSRC,
undated. Pp. 431-436.

(Early publication received March 9, 1961) 




Journal of Applied Psychology 
1961, Vol. 45, No. 5, 289-294 


A FACTOR ANALYSIS OF JOB ACTIVITIES¹


GEORGE J. PALMER, JR. 


Louisiana State University 


Over a period of years considerable prog- 
ress has been made in the identification of 
the nature of different human abilities and 
characteristics. In the area of mental abilities, 
for example, the classic work of Thurstone 
(1947) is, of course, generally considered a 
milestone. In the domain of motor skills, 
some of the studies by Fleishman (1960) 
have helped to crystallize the nature of these 
human dimensions. 

While some human attributes have thus 
become somewhat clarified, there has been 
relatively little effort given to the identifica- 
tion of what might be thought of as the 
dimensions of jobs. Viewing human work and 
its many variations, it is reasonable to be- 
lieve that there might be certain basic di- 
mensions of work activity which conceivably 
could be identified and described. A few stud- 
ies have tended to confirm this hypothesis. 
Among such studies are those by Coombs 
and Satter (1949), Hemphill (1959), Besnard
(1954), and McCormick, Finn, and
Scheips (1957).

In approaching the domain of human work,
however, one could take at least two different
points of view. First, one could consider the
description of work primarily in terms of
job characteristics. Such descriptions might
emphasize the technological aspects of jobs,
or they might emphasize what workers accomplish
in their jobs (such as baking, [illegible],
painting, and so forth). Descriptions in
terms of technology or in terms of accomplishments
might be thought of as “job oriented”
descriptions of the work activities of
people. On the other hand, one could view
human work in terms of what workers do
in performing their jobs, e.g., visual, manual,
or communications activities. These might be
thought of as “worker oriented” descriptions
of work activities, i.e., how workers [illegible]
technological factors to achieve certain
accomplishments.
¹ This study was supported by the
Purdue Research Foundation.


AND ERNEST J. McCORMICK


Purdue University 


The distinction between the “job ori- 
ented” and “worker oriented” approaches is 
not always clear cut; it is largely a matter 
of emphasis. In general, perhaps the differ- 
ences can be summarized by saying that the 
“job oriented” view is likely to place more 
emphasis on the conditions and results of 
work, while the “worker oriented” view tends 
to place more emphasis on worker activities 
as such. 

This study was one that was viewed as a 
“preliminary” and “probing” effort in this 
direction. In particular, it was the plan to 
develop a check list of job activities of a 
“worker oriented" nature, to use these in 
describing a sample of jobs, and to subject 
the results to factor analysis to attempt to 
identify the worker activity “dimensions” of 
the jobs in question. In this exploratory 
effort it was realized that the experimental 
check list of work activities that might be de- 
veloped would be only something of a “first 
approximation" of what might ultimately be- 
come a more precise device. It was further 
realized that, with what would be a restricted 
sample of jobs, the resulting “dimensions” 
would have to be viewed in the light of the 
sample of jobs in question and, therefore, 
would not necessarily have general applicabil- 
ity. In other words, this was viewed pri- 
marily as a methodological study to find out 
if it might be profitable to explore this rela- 
tively virgin territory. 


PROCEDURE 


Job Activities Check List 


A 177-item check list, describing job activities in 
terms of worker behaviors, was organized generally 
along lines suggested by information theory no- 
tions, such as input, decision, storage, and output. 
The major parts of the check list and brief descrip- 
tion of their contents are presented below. Most of 
the examples given are the identifying stems from 
items that were more fully defined and illustrated 
in the check list proper. 

1. Information-Receiving Activities (35 items relative
to: use of the senses to obtain information to
carry out work activities; for example, identifies or
distinguishes sounds by pitch or tone)

2. Mental Activities (43 items relative to: de- 
cision making activities; uses of various kinds of 
knowledge, including general and specific vocational 
educational development; for example, evaluates
performance of people) 

3. Supervisory and Communications Activities (33 
items relative to: responsibility for supervision; 
oral, written, and signal communications; interper- 
sonal contacts involved in these activities—for ex- 
ample, supervises work group) 

4. Manual Activities (25 items relative to: use 
of hands to operate controls, vehicles, tools, or 
equipment; direct use of hands to modify materials;
for example, operates keyboard device, i.e.,
typewriter)

5. General Bodily Activities (24 items relating to: 
use of the whole body in simple or coordinated 
movements—for example, climbs) 

6. General Work Conditions (7 items relating to: 
conditions in the physical work environment—for 
example, noise level) 

7. General Job Characteristics (10 items relating 
to: ratings of the job as a whole on general worker 
attributes—for example, adaptability to routine) 


Sample of Jobs 


A job sample was selected to represent the job
structure of a large steel-producing firm. Of an
estimated 10,000 jobs in the company, job descrip- 
tions were available for about 5,000, from which the 
sample of 250 was selected. 

The composition of the sample was as follows: 
Exempt salaried jobs (operating, administrative, 
technical, and related jobs), 40%, were described 
in terms of purpose, function, skills, knowledge, and 
responsibility. Nonexempt clerical jobs, 10%, were 
described in terms of tasks and time distribution.
Hourly paid jobs (steel production and processing, 
maintenance, inspection, and related jobs), 50%, 
were described concisely in terms of functions, edu- 
cation, supervision, and other conditions, as given 
by the Cooperative Wage Bureau (1953) for steel 
industry jobs. 

It should be noted that the objects of study were 
jobs, not personnel, nor positions (personnel-in- 
jobs). The jobs were not sampled in proportion to 
numbers of incumbents, so that the present results 


cannot be interpreted in terms of importance for 
employed personnel. 


Analysis of Jobs 


Data and Method. The basic information about 
the jobs in the sample was contained in written 
job descriptions. Each job in the sample was
analyzed on the basis of the information in its
description, in terms of the items of the check list.
One of the investigators analyzed each job by
reading its description, indicating on the check list the
presence or degree of each job activity which he
considered to be pertinent to the job in question.

Reliability of Analysis. The reliability of analysis 
was estimated from the degree of agreement be- 
tween competent analysts for a sample of 32 jobs. 
Eight graduate students in industrial psychology 
made independent analyses of 2 to 5 jobs apiece, 
and their analyses were compared with those of the 
investigator. The index of agreement for dichotomous 
items was the coefficient of overlap, Peters and Van 
Voorhis (1940), Formula 61. For continuously 
scored items, the index was the Pearson product- 
moment correlation. 
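
For the continuously scored items, the agreement index is the ordinary Pearson product-moment correlation. The following is a minimal sketch of that computation; the function name and the two sets of ratings are hypothetical, and the guard for a zero denominator is an added assumption about how an indeterminate index might be handled.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length rating lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        return None   # indeterminate: one analyst gave identical ratings throughout
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical ratings of one item across five jobs.
investigator = [0, 1, 2, 3, 4]
student = [0, 1, 2, 4, 3]
print(round(pearson_r(investigator, student), 2))   # 0.9
```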

Reliability data were based on 130 items from
the total list. Some items were eliminated because
they were seldom or never checked. Thirty-two
items were eliminated because the overlap
coefficient was indeterminate, a result of a zero
denominator.

The median coefficient of agreement was .75 and
only 7% were below .41, indicating a reasonably
satisfactory degree of agreement among the analysts
in selecting the check list items which they
considered to be pertinent to each job.


Methods of Data Analysis


The check list data for the sample jobs were
correlated and subjected to factor analyses in two
stages.

Initial Factor Analyses. The first factor analyses
were based on the correlations of check list variables
within each of five groups of variables. The
groups consisted essentially of the items in Sections
1 through 5 of the check list. Items were scored
dichotomously and were intercorrelated by an
index developed specifically for check list data by
Ben J. Winer, Purdue University.² Items which were
checked less than 13 times were not included in
the analysis, so that the final matrices for Sections
1 through 5 contained from 21 to 31 items. Each
matrix was factored separately by the multiple
group method as given by Thurstone (1947). The
highest coefficients in the columns or rows of each
group were used as communality estimates.

Factor Scores. For purposes of estimating the
factors, simple integral weights ranging from -3
to +3 were assigned to items according to the sign
and magnitude of their factor loadings. For each
factor variable, each job's score was the algebraic
sum of weighted items checked for it. It would
have been possible to develop a complete multiple
regression equation to estimate each factor, but as


² For a 2 × 2 table, a|b/c|d, where a is the [illegible]
cell and letters are cell frequencies, this index is


12 [p 3/20 - 9l 
E atb+e . 


" insid! 
This coefficient takes the sign of the difference inside
the absolute value bars. It may be noted, too, that it
ignores Cell d and gives equal weight to b and c.




Guilford (1954), for one, points out, the gains are 
small for the labor. 
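
The factor scoring procedure amounts to an algebraic sum of integral weights over the items checked for a job. The sketch below illustrates that idea only; the item names, the weights, and the job are hypothetical, and the assumed weight range of -3 to +3 is a reading of the procedure rather than a documented detail.

```python
def factor_score(item_weights, checked_items):
    """Algebraic sum of the weights of the items checked for one job.

    item_weights maps item name -> integral weight (assumed -3 to +3,
    signed like the item's factor loading); items without a weight on
    this factor contribute nothing.
    """
    return sum(item_weights.get(item, 0) for item in checked_items)

# Hypothetical weights for one bipolar factor.
weights = {
    "reads gauges": 3,
    "inspects materials": 2,
    "reads written reports": -2,
    "reads blueprints": -3,
}
job_items = ["reads gauges", "inspects materials", "reads written reports", "climbs"]
print(factor_score(weights, job_items))   # 3
```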

Final Factor Analysis. For purposes of integrating 
the factors obtained from parts of the check list, 
a sixth factor analysis was carried out. Factor scores 
which had been developed for the 14 factors re-
sulting from the factor analyses of Sections 1
through 5 were intercorrelated (using Pearson prod- 
uct-moment correlations) together with 14 items 
not hitherto included for analysis. Some of these 
14 items were surviving items from Sections 6 and 
7 concerning General Work Conditions and Gen- 
eral Job Characteristics; also included were several 
items relating to educational development which 
had been reserved for inclusion in this final analysis. 

A 28 X 28 correlation matrix was factored by the 
method of principal components programed for the 
Datatron computer. The highest correlations were 
used as trial communalities in a first factoring and, 
upon identification of the chief factors, re-estimated 
communalities were used in a second analysis. The 
decision to stop factoring was based on the rate of 
reduction of the residuals and rotation to orthogonal 
simple structure was effected by the graphic single- 
plane method. Jobs were given scores on the rotated,
final factor variables by the weighting procedure


previously described. 
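
To make the idea of principal components factoring concrete, the following sketch extracts the first component of a small correlation matrix by power iteration. It is not the Datatron program used in the study: the function name and the 3 × 3 matrix are hypothetical, unities are left in the diagonal instead of the estimated communalities described above, and no rotation is attempted.

```python
import math

def first_principal_component(corr, iterations=200):
    """Return (eigenvalue, loadings) for the largest principal component."""
    n = len(corr)
    vector = [1.0 / math.sqrt(n)] * n            # arbitrary starting direction
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
        length = math.sqrt(sum(x * x for x in product))
        vector = [x / length for x in product]
    product = [sum(corr[i][j] * vector[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vector, product))
    loadings = [math.sqrt(eigenvalue) * v for v in vector]
    return eigenvalue, loadings

# Hypothetical correlation matrix (not data from the study).
corr = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]
eigenvalue, loadings = first_principal_component(corr)
print(f"share of total variance: {eigenvalue / len(corr):.2f}")
print([round(x, 2) for x in loadings])
```

Further components would be obtained by removing the first component's contribution from the matrix and repeating the iteration on the residuals.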


RESULTS³


Results of Initial Factor Analyses 


The five initial factor analyses resulted
in 14 separate oblique group “factors.” These
factors were typically bipolar, because to
some extent job activities had this mutually
exclusive character, and partly because of the
way items were defined.

In interpreting the factors, it should be
noted that the basic data were descriptive
of jobs, and that the resulting factors were
characterized by combinations of coexistent
activities. The basic data are perhaps best
considered to be occupational (or perhaps
organizational) in nature, rather than psychological.
The factors, then, consisted of job
activities which tended to “hang together”
in jobs. Thus, it might be that two or more
distinctly different “types” of activity might

³ A seven-page table containing items of highest
loading and sample jobs for the multiple group
factors has been deposited with the American
Documentation Institute. Order Document No. [illegible]
from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress.



be common to the same factor because of the 
fact that these activities typically occurred 
together in jobs. This point is emphasized be- 
cause factors from such data would not be 
expected to have the same kind of “psycho- 
logical” sense that would be true of relation- 
ships among measures of human test per- 
formance. 

The interpretation of the factors is indi- 
cated by the descriptive titles which are in- 
tended to summarize the general pattern of 
factor loadings. The multiple group factors 
are listed below according to the section of 
the check list from which they were derived. 
To illustrate each factor, sample jobs which 
typify jobs with high and low scores (in that 
order) on the factor are given in parentheses. 

Information-Receiving Activities.⁴ Factor
1: Visual Information from Physical Objects 
vs. Displays (Scrap Inspector vs. Copy- 
writer) 

Factor 2: Visual vs. Tactual Information 
from Objects (Welder vs. Fireman) 

Mental Activities.⁵ Factor 3: Routine vs.
Administrative Decision Making (Car 
Checker vs. Farm Manager) 

Factor 4: Supervisory and Professional De- 
cisions vs. Physical Activity (General Fore- 
man vs. Operator, Tube Rolling) 

Factor 5: Routine vs. Business Planning 
Decisions (Hydrostatic Tester vs. Supervisor 
of Commercial Markets) 

Supervisory and Communications Activi-
ties.⁶ Factor 6: Routine Verbal Communica-
tions vs. Signals or Managerial Contacts
(Personnel Clerk vs. Medical Director)

Factor 7: Communications Given Verbally
vs. Via Signals (Head Timekeeper vs. Ambu- 
lance Driver) 

Factor 8: Communications from Signals 
vs. Personal Contacts (Craneman vs. Market 
Specialist) 

Factor 9: Originating vs. Receiving Com- 
munications (Superintendent of Foundry vs. 
Clock Repairman) 


⁴ For information purposes, the correlation between
the group factors was r12 = .35.

⁵ The correlations among the group factors were
r34 = -.30, r35 = .52, r45 = -.30.

⁶ The correlations among the group factors were
r67 = .27, r68 = .55, r69 = .12, r78 = .64,
r79 = .12, r89 = .32.




TABLE 1
FACTOR LOADINGS

Variable                                                                Loading

Factor I: General Decision Making and Mental Activity
  Reasoning                                                               .91
  Language                                                                .91
  Decision Making vs. Physical Activity                                   .85
  Originating vs. Receiving Communications                                .82
  Latitude of Job Activities                                              .19
  Mathematics                                                             .32
  Communications Given Verbally vs. via Signals                           .10

  FACTOR SCORE LEVEL      SAMPLE JOB
  High                    Medical Director
  Average                 Salesman
  Low                     Bricklayer Apprentice

Factor II: Sedentary vs. Physical Work Activity
  Sedentary vs. Physical Activity                                         .52
  Visual Information from Physical Objects vs. Displays                  -.46
  Routine vs. Business Planning Decisions                                -.47
  Manual Operations                                                      -.52
  General Physical Activities vs. Postural Restrictions                  -.63

  FACTOR SCORE LEVEL      SAMPLE JOB
  High                    Interviewer
  Average                 Guard
  Low                     Hooker, Crane

Factor III: Communications in Business Management vs. Information in Routine Physical Work
  Originating vs. Receiving Communications                                .31
  Communications from Signals vs. Personal Contacts                      -.38
  Routine Verbal Communications vs. Signals or Managerial Contacts       -.39
  Routine vs. Administrative Decision Making                             -.44
  Routine vs. Business Planning Decisions                                -.48
  Visual Information from Physical Objects vs. Displays                  -.54

  FACTOR SCORE LEVEL      SAMPLE JOB
  High                    Superintendent
  Average                 Clerk Typist
  Low                     Mechanical Repairman

Factor IV: Knowledge of Tools vs. Mathematics
  Use of Craftsman’s Tools vs. Other Hand Implements                      [illegible]
  Use of Body of Knowledge                                                [illegible]
  Routine vs. Administrative Decision Making                             -.34
  General Physical Activities vs. Postural Restrictions                  -.37
  Mathematics                                                            -.52

  FACTOR SCORE LEVEL      SAMPLE JOB
  High                    Instructor, Carpenter Apprentice
  Average                 Mill Shearman
  Low                     Payroll Clerk



Manual Activities.⁷ Factor 10: Manual
Operations (Operator, Tube Rolling vs. Head 
Timekeeper) 

Factor 11: Use of Craftsman’s Tools vs. 
Other Hand Implements (Millwright vs. Re- 
search Technician) 

General Bodily Activities.⁸ Factor 12: Gen-
eral Physical Activities vs. Postural Restric-
tions (Helper, Axle-Maker vs. Safety In- 
spector) 

Factor 13: Sedentary vs. Physical Activity 
(Supervisor vs. Operator, Oiling Machine) 

Factor 14: Manual vs. General Bodily 
Activities (Time Study Man vs. Hooker, 
Crane) 


Results of Final Factor Analysis 


In the final factor analysis, a principal 
components analysis extracted four factors 
which may be considered higher-order (al- 
though not second-order) factors; all were 
based in part on factors derived from the 
initial factor analyses, and in part on the 
14 new variables introduced in this final 
analysis. The bipolar nature of some of these 
final factors resulted from two conditions: 
first, from the positive and negative loadings 
of the variables, and, second, from the bipolar
nature of the multiple group factors
correlated with the final factors. Principal
(orthogonally rotated) loadings and sample
jobs are given for each final factor, in
descending order of importance (see Table 1).

Factor I accounts for 66% of the variance
and is a general factor. Factors II, III, and
IV account for 15 to 8% of the variance.
Factor I indicated that the general differences
between jobs lie in Decision Making
and Mental Activities. Factor II indicates
differences in physical activity. Factor III
refers to various sources and methods of
communications for originating, [illegible], and
receiving job information. Factor IV reflects
knowledge or skill in the use of tools,
perhaps relating to manual and [illegible]
routines. The mathematical aspect of
Factor IV is, of course, independent of
Factor I on which mathematics also has a high
loading.

⁷ The correlation between the group factors was
r10,11 = .00.

⁸ The correlations among the group factors were
[illegible].

The emergence of Factor I, with its em- 
phasis on decision making and mental activi- 
ties as the most dominant factor, seems to 
be in keeping with the results of a number 
of job evaluation studies which have sug-
gested that the most important factors
influencing the evaluation of jobs are those
associated with mental requirements, skill, 
experience, and related variables (Tiffin & 
McCormick, 1958). Designation of this fac- 
tor in terms of decision making activities 
gives it an interpretation consistent with the 
views of organization theorists, such as 
Simon (1948), who hold that, on the vertical 
dimension of organizations, the main differ- 
ence between jobs is related to the level of 
decision making. 


Limitations 


On the whole, the results show that a 
large number of job activities for a rather 
wide range of jobs can be organized mean- 
ingfully in terms of a small number of inde- 
pendent dimensions. However, the results 
should be interpreted with certain limitations 
in mind. 

First, it should be noted that the check 
list of worker oriented activities was viewed 
as a "first approximation" toward the de- 
velopment of what might ultimately be a 
much more refined device. Further, the sam- 
ple of jobs was clearly a very limited one, 
the sample consisting of jobs from only one 
industry. It should be pointed out that the 
job descriptions varied in length, detail, and 
format, depending upon the class of jobs in 
question and, also, that the check list was 
used with the job descriptions (the secondary 
source), rather than with direct observation 
of the jobs themselves. 


CONCLUSIONS 


This study was carried out as something 
of a “probing” project to test the feasibility 
of identifying what might be thought of as 
the “dimensions” of worker activities of
jobs. Specifically, factor analyses of five sections
of the worker activities check list produced
14 multiple-group factors. A subsequent
factor analysis of the jobs in terms of
their scores on these 14 factors and 14
other selected variables produced four more
general factors, as follows: General Decision
tor; Sedentary versus Physical Work Activ- 
ity; Communications in Business Manage- 
ment versus Information in Routine Physical 
Work; and Knowledge of Tools versus 
Mathematics. 

In examining the original 14 factors and 
the four final factors, a number of them 
seem to make a reasonable amount of “logi- 
cal" sense. Certain of the factors, however, 
seem to be rather distinctly influenced by 
what apparently were fairly unique facets of 
some of the jobs in the particular sample 
in question. 

It may be concluded that the results tend 
to support the view that work activities can 
be identified or measured and that the variety 
of human work activities may be organized 
with greater simplicity and economy in terms 


of a smaller number of relatively independ- 
ent dimensions. 


REFERENCES 


Besnard, G. G. Shred-outs of tasks performed by
senior B-29 mechanics. USAF Personnel Train.
Res. Cent. tech. Rep., 1954, No. 54-4.

Coombs, C. H., & Satter, G. A. A factorial approach
to job families. Psychometrika, 1949, 14,
33-42.

Cooperative Wage Bureau. Job description and
classification manual for hourly-rated production,
maintenance, and nonconfidential clerical jobs.
Pittsburgh: CWB, 1953.

Fleishman, E. A. The description and prediction
of perceptual-motor skill learning. Presented at
Symposium on Training Research and Its Implications
for Education, University of Pittsburgh,
February 1-3, 1960.

Guilford, J. P. Psychometric methods. New York:
McGraw-Hill, 1954.

Hemphill, J. K. Dimensions of executive positions:
A study of the basic characteristics of the positions
of ninety-three business executives. Educ.
Test. Serv. res. Bull., 1959, No. 59-5.

McCormick, E. J., Finn, R. H., & Scheips, C. P.
Patterns of job requirements. J. appl. Psychol.,
1957, 41, 358-364.

Peters, C. C., & Van Voorhis, W. R. Statistical
procedures and their mathematical bases. New
York: McGraw-Hill, 1940.

Simon, H. A. Administrative behavior. New York:
Macmillan, 1949.

Thurstone, L. L. Multiple factor analysis. Chicago:
Univer. Chicago Press, 1947.

Tiffin, J., & McCormick, E. J. Industrial psychology.
Englewood Cliffs, N. J.: Prentice-Hall,
1958.


(Received August 31, 1960) 




Journal of Applied Psychology 
1961, Vol. 45, No. 5, 295-302 


USE OF ANALYTICAL INFORMATION CONCERNING 
TASK REQUIREMENTS TO INCREASE THE 
EFFECTIVENESS OF SKILL TRAINING¹


JAMES F. PARKER, JR.


Psychological Research Associates


AND EDWIN A. FLEISHMAN


Yale University


The design of training programs for complex
skills customarily involves a combination
of expert judgment and known principles
of learning. Often these programs are highly
successful. However, to any psychologist
involved with practical training problems, the
gap between learning principles developed
in the laboratory and the application of these
to training practice becomes painfully apparent
(DuBois & Manning, 1957; Gagné
& Bolles, 1959; McGehee, 1958). There are
many reasons for this including a lack of
knowledge of the manner in which principles
of learning are related to particular characteristics
of the tasks learned. Furthermore,
laboratory studies of learning frequently are
concerned with treatment variables which
are apt to have only a small practical effect
in training situations relative to the large
effects of individual difference variables
(e.g., abilities) (DuBois, 1960; Fleishman &
Hempel, 1955). The problems of task dimensions
and ability differences are not unrelated
and can be handled in the same conceptual
framework. For example, tasks can
be described in terms of the abilities required
to perform them (Fleishman, 1960).

The present report evaluates a training
program in which use was made of analytical
information about ability requirements
of the task to be learned. Two types of
analytical information were used. The first
involved knowledge about the abilities
(e.g., spatial orientation) which contribute
to performance on the task in question.
The second type of information concerns
the relative contribution of the
different specific task components (e.g., control
stick or pedal movements) to over-all
task proficiency. In this latter case one can
ask if the specific ability to use the pedals
at a given stage of training is highly related
to learning rate and to final proficiency; if
so, can this information be used effectively
in training? In applying information about
both kinds of task components (general
abilities and specific intratask relationships),
special attention was given to their contributions
to individual differences at different
stages of training on the task in question.

¹ ... under Contract ... Psychological Research Associates ...

We will proceed in the following order: 
We will describe the task to be learned, how 
the analytical information was obtained, the 
training program developed, and the evalua- 
tion of the program. 


THE TRAINING TASK
Display-Control Characteristics 


The task to be learned was a complex 
tracking task, described in detail in previous 
reports (Parker & Fleishman, 1959, 1960). 
A device was constructed so as to simulate 
roughly the display characteristics and con- 
trol requirements of an interceptor aircraft. 
The subject envisioned himself to be flying 
the attack phase of an airborne radar inter- 
cept mission. His task was to keep a target 
dot in the center of an oscillograph display 
while at the same time trying to keep a 
pointer centered on an instrument dial, desig- 
nated a “sideslip indicator.” The controls 
used by the subject involved a standard air- 
craft control system of stick and rudder ped- 
als. These were coupled in a manner similar 
to those of an actual aircraft. Thus, if the 
target dot on the scope was to the right 
the subject made appropriate control move- 
ments to steer the “craft” to the right. These 
movements would bring him “on target” 
and the target dot would return to the center
of the scope display. As in an aircraft, all
such turning movements required coordi- 
nated action of stick and rudder controls. 
Thus, application of right control pressure 
without the proper amount of right rudder 
produced a sideslip to the left and a conse- 
quent left deflection on the “sideslip indi- 
cator.” This indicator provided the subject 
with additional information regarding the 
“degree of coordination” he was achieving 
in centering the dot. The target course was 
programed through an analog computer. The 
control stick was an acceleration control; 
that is, the extent of the subject’s stick move- 
ment was directly related to the acceleration 
of dot displacement. The rudder pedals ap- 
proximated a pure velocity control. The task 
was a very difficult one for subjects to mas- 
ter. Photographs of the device may be found 
elsewhere (Parker & Fleishman, 1959, 1960). 


Performance Scores 


The following five performance measures 
were obtained during each testing session. 
Four of these were error scores; one was 
a time-on-target score. The error scores rep- 
resent the extent of target dot (or indicator 
needle) displacement summated through 
time. 

Integrated absolute error score. This was 
recorded at the conclusion of every trial and 
was produced by summing algebraically the 
three absolute error part scores described in 
accordance with this relationship: 


$$T = \tfrac{1}{2}X + \tfrac{1}{2}Y + Z \tag{1}$$


where: T = Integrated absolute error score
X = Absolute azimuth error
Y = Absolute elevation error
Z = Sideslip error
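
As an illustration only, the weighting in the relationship above can be written as a small function. The function name and the example part scores are hypothetical and are not data from the study; the sketch simply shows that azimuth and elevation error each enter at half weight while sideslip error enters at full weight.

```python
def integrated_absolute_error(azimuth: float, elevation: float, sideslip: float) -> float:
    """Combine the three absolute error part scores as T = X/2 + Y/2 + Z."""
    return 0.5 * azimuth + 0.5 * elevation + sideslip


# Hypothetical part scores for a single trial (illustrative values only).
example_total = integrated_absolute_error(azimuth=400.0, elevation=200.0, sideslip=150.0)
print(example_total)  # 450.0
```

Because sideslip is weighted twice as heavily as either tracking axis, a given reduction in sideslip error lowers the integrated score twice as much as the same reduction in azimuth or elevation error.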


Each of the following part scores was re- 
corded every third trial. An automatic se- 
quencing mechanism switched from one part 
score to the next at the conclusion of each 
trial. 

Azimuth (yaw) part score. A summation 
of the total error occurring in the horizontal 
axis of the oscillograph display 

Elevation (pitch) part score. A summation 
of the total error occurring in the vertical 
axis of the display 


James F. Parker, Jr. and Edwin A. Fleishman 


Sideslip error part score. A measure of 
the lack of coordination between stick and 
rudder movements 

Time-on-target score. The time during the 
trial when the target dot and the sideslip 
indicator needle were both (simultaneously) 
within prescribed tolerance limits 


THE ANALYSIS OF TRACKING 
PERFORMANCE 


We now turn to the analysis made of the
component task requirements for this particular
tracking task. The rationale and details
of this analysis have been reported elsewhere
(Parker & Fleishman, 1959, 1960). We
will summarize this briefly and then describe
the application of this analysis to the
experimental training program.


Derivation of Ability Requirements 


The approach here was an extension of a
series of studies by Fleishman (1957, 1960),
Fleishman and Fruchter (1960), Fleishman
and Hempel (1954, 1955), and Parker and
Fleishman (1959, 1960). These studies use
the fact of individual differences to probe into
the ability requirements of a task. These
studies have demonstrated that in learning
a complex task the particular abilities
contributing to proficiency may change
systematically at different stages of practice. This
suggests the possibility that a more effective
method of training might take into account
this changing pattern of abilities.

Information regarding the changing ability
requirements of the tracking task
described above was obtained in a previous
study (Parker & Fleishman, 1959). A
comprehensive battery of 44 reference tests of
known factorial structure was administered
to 203 subjects who also mastered the tracking
task. There were 17 tracking sessions,
each consisting of 21 one-minute trials. The
intercorrelations among the reference tests
were factor analyzed. The tracking performance
scores (integrated absolute error measure)
at 10 selected stages of practice
were projected onto the rotated axes defined
by the reference battery, using Mosier's
extension method (Fruchter, 1954). This
provided the factor loadings of each tracking




Effectiveness of Skill Training 


FIG. 1. Changes occurring in the importance of two ability factors
(Spatial Orientation and Multilimb Coordination) with increased
practice in tracking. (Ordinate: percentage of variance, integrated
error score. Abscissa: practice sessions; inset: time segments.)


practice stage, and indicated the amount of
variance in tracking at that stage of practice
which could be accounted for by each
ability factor.

Although 15 factors² were identified in
the reference battery, only two, Spatial
Orientation and Multilimb Coordination,
contributed significant variance of a systematic
nature to tracking performance.

Figure 1 presents the contribution of these
two factors through the training program.
It can be seen that after Practice Session
3, there is a decrease in the contribution
of Spatial Orientation to tracking proficiency.
On the other hand, the Multilimb Coordination
factor increases in importance up to
Session nine. We will return to the use made
of this information in the subsequent
experimental training program.


² The factors identified were: I. Spatial Orientation,
II. Control Precision, III. [...] Movement, IV. Manual
Dexterity, V. [...], VI. Verbal Comprehension, VII. [...],
VIII. Arm-Hand Steadiness, IX. Perceptual [...],
X. Visualization, XI. Mechanical Experience, XII. Pursuit
Confusion Doublet, XIII. Multilimb Coordination,
XIV. Finger Dexterity, and XV. [...].

³ The 17 practice sessions were subdivided into 51 time
segments [...]. The inset of Figure 1 shows the relation of
practice session to time segments and to the 10 time
segments selected for use in the data analyses.


Determining the Contribution of Subtask Components


As described above, measures of component
performances (azimuth error, elevation
error, and sideslip error) were obtained
along with measures of over-all performance 
(integrated error and time-on-target) during 
the course of practice. In a complex task 
such components are related to and form a 
part of over-all proficiency, although these 
precise relationships are seldom known. For 
the present task the correlations among all 
these component scores with each other and 
with the over-all proficiency measures were 
obtained for 10 different stages of practice 
(Parker & Fleishman, 1959). Of particular 
interest for the present study are the correla- 
tions of each component score, recorded at 
each stage of practice, with final over-all 
proficiency recorded in the last stage of 
practice (fiftieth time segment³). These
correlations appear in Table 1.

It can be seen that how well a person does 
in the component performance (using stick, 
rudder, etc.) in the early stages of learning 
is not very predictive of final over-all pro- 
ficiency. (It might be mentioned that early 
over-all proficiency did not predict final 
level either.) However, as practice continued 
prediction from all components increased, 
with the best predictions obtained from the 
“sideslip” score during later practice sessions. 
Since this component score reflected the ex- 
tent of rudder pedal and stick coordination 
we have further confirmation of the factor 
analysis results, which indicated an increase 
in the contribution of the Multilimb Coordi- 
nation factor. 

On the assumption that knowledge of the
relative contribution of component performances
can provide guidance for training, this
information was used in the experimental
training program to be described next.


TABLE 1
CORRELATIONS OF COMPONENT SCORES AT VARIOUS
STAGES OF PRACTICE WITH TERMINAL INTEGRATED
ERROR SCORE


                          Time Segment

Error Score     1    3    6    8   10   15   25   34   43

Azimuth        02  -07   03   11   19   27   28   34   49
Elevation      08   10   12   17   26   34   36   42   52
Sideslip       12   04  -04   00   13   30   34   51   65


Note. Decimal points omitted.
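
The pattern described in the text can be checked directly against the entries of Table 1. The following is a minimal sketch, not the authors' procedure: it restores the omitted decimal points, finds which component score correlates most highly with terminal integrated error at each time segment, and expresses a correlation as a proportion of shared variance (r squared). The function and variable names are illustrative.

```python
# Correlations from Table 1 with decimal points restored.
SEGMENTS = [1, 3, 6, 8, 10, 15, 25, 34, 43]
TABLE_1 = {
    "azimuth":   [0.02, -0.07, 0.03, 0.11, 0.19, 0.27, 0.28, 0.34, 0.49],
    "elevation": [0.08, 0.10, 0.12, 0.17, 0.26, 0.34, 0.36, 0.42, 0.52],
    "sideslip":  [0.12, 0.04, -0.04, 0.00, 0.13, 0.30, 0.34, 0.51, 0.65],
}


def best_predictor_by_segment(table, segments):
    """Map each time segment to the (component, r) pair with the largest r."""
    result = {}
    for index, segment in enumerate(segments):
        name = max(table, key=lambda component: table[component][index])
        result[segment] = (name, table[name][index])
    return result


def shared_variance(r):
    """Proportion of terminal-score variance associated with a correlation r."""
    return r ** 2


best = best_predictor_by_segment(TABLE_1, SEGMENTS)
for segment in SEGMENTS:
    component, r = best[segment]
    print(segment, component, r, round(shared_variance(r), 3))
```

On these figures the sideslip score becomes the strongest single predictor only at Time Segments 34 and 43, which agrees with the statement that the best predictions came from the sideslip score during later practice.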




THE EXPERIMENTAL TRAINING STUDY


Three groups of subjects were compared. Groups
I and II are regarded as control groups and Group
III is the experimental group. These groups differed
only in the nature of the verbal instructions
administered during the course of practice.

All groups were drawn from the same popula- 
tion: freshmen Air Force ROTC students at the 
University of Maryland. All students were volun- 
teers and were paid for their participation in the 
testing program. The training period was identical 
for both control groups and the experimental group: 
17 sessions, distributed over 6 weeks, with each 
session consisting of 21 one-minute trials in tracking.

Group I—no formal training. The first control
group (N = 203) was the group described previously.
It provided the data for the correlational
and factor analyses of tracking performance. This
group received no formal training on the tracking
task other than a brief explanation of the controls,
the display, and the task. During the entire learning
period these subjects received no instruction of
any kind other than the answering of questions.
They were told their score following each trial,
however, in order that they might evaluate their
individual progress.

Group II—"common sense” training. The second
group (N = 61) was administered a “common
sense" type of training program constructed to be 
as much like operational military training for this 
type of task as possible. Discussions were held with 
qualified navy personnel to insure that this type of 
training was realistic in terms of present navy 
practices. This group provided a basis for de- 
termining whether the experimental training repre- 
sented an improvement over current practices when 
both were compared with Group I. This program 
involved an initial explanation and demonstration, 
guidance, and assistance to subject during early 
trials, and later individual practice with critiques 
following certain sessions. 

Group III—the experimental training. It will be 


recalled that this study was based upon two 
premises: 


1. When it is known that an ability is important 
at one point in the practice schedule, the verbal 
emphasis of that ability at an earlier point in time 
will facilitate learning. 

2. A knowledge of the learning of component 
activities within the tracking task and the inter- 
relation of these scores with terminal proficiency 
can be used as a means of structuring training. 

The above premises were the basis of the
experimental training program for Group III (N = 60).
As Figure 1 indicated, Spatial Orientation 
achieved maximum importance at the fourth track- 
ing session. Therefore, training instructions were 
developed to be used prior to and during the third 
session of tracking. The specific instructions were 
developed in an attempt to give insight into the 




Spatial Orientation requirements of this task. The 
relationship between target position on the scope 
and the required direction of movement for both 
stick and rudder controls was described carefully. 
The Multilimb Coordination factor starts a rapid 
rise in importance at the end of the fifth session 
and rises to maximum importance at the ninth 
session (Figure 1). Therefore, training instructions 
were developed, for administration prior to and 
during the fifth session, which emphasized the 
simultaneous use of control stick and rudder in 
controlling movement of the target dot. 

The component tracking score information pre- 
sented in Table 1 was utilized in the following 
way for Group III. Prior to and during both the 
ninth and eleventh sessions (Time Segments 25
and 31, respectively³), instructions reemphasized
the importance of centering the sideslip indicator 
and suggested the importance of monitoring that 
indicator rapidly and frequently while continuing 
to control the target dot on the oscillograph. 
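
The timing of the Group III instructions can be laid out as a simple schedule. The sketch below assumes that the 17 sessions divide evenly into 51 time segments (three per session); the source does not state this rule explicitly, but it is consistent with Sessions 9 and 11 beginning at Time Segments 25 and 31. The names and the wording of the schedule entries are illustrative paraphrases, not the instructions actually given.

```python
SESSIONS = 17
TIME_SEGMENTS = 51
# Assumption: equal subdivision, giving three time segments per session.
SEGMENTS_PER_SESSION = TIME_SEGMENTS // SESSIONS


def first_segment_of_session(session: int) -> int:
    """Return the first time segment falling within a practice session."""
    if not 1 <= session <= SESSIONS:
        raise ValueError("session must be between 1 and 17")
    return (session - 1) * SEGMENTS_PER_SESSION + 1


# Sessions at which supplementary instructions were given to Group III,
# with a paraphrase of what each set of instructions emphasized.
INSTRUCTION_SCHEDULE = {
    3: "Spatial Orientation: target position versus required control direction",
    5: "Multilimb Coordination: simultaneous use of stick and rudder",
    9: "Sideslip indicator: keep it centered and monitor it frequently",
    11: "Sideslip indicator: keep it centered and monitor it frequently",
}

for session, emphasis in INSTRUCTION_SCHEDULE.items():
    print(session, first_segment_of_session(session), emphasis)
```

Under this assumption each set of instructions precedes the stage at which the corresponding ability or component score was found to matter most.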

It should be noted that the experimental program 
administered to Group III was “overlaid” on the 
"common sense" training program which was ad- 
ministered to Group II. That is, all procedures 
used with Group II, such as initial indoctrination, 
early monitorship and guidance, and later critiques 
were administered to Group III in conjunction with 
the experimental program. This was done in order 
to allow a direct evaluation of any improvement 
in training effectiveness as a result of the addition 
of the experimental portions. 

Every effort was made to keep motivation high 
in all three groups. There was every indication 


that subjects in all three groups were highly 
motivated to do their best. 


RESULTS 


A check on the initial matching of the
three groups indicated the groups matched
on time-on-target (see Figure 3), azimuth
error, and elevation error measures. However,
the integrated error and sideslip error
scores indicated a slight initial superiority
for Group III (p < .05). A check on this
revealed this was due to an inadvertent
instruction given these subjects during their
initial session to center the rudder pedals.
Since this was such an easy operation, it was
felt that this effect was transitory and did
not affect the results. This assumption seems
tenable in view of the parallel results
obtained with the time-on-target score (Figure 3),
on which the groups were initially
matched.

Figures 2 and 3 present learning curves
for all three groups in terms of total
"integrated error" and "time-on-target" measures.


FIG. 2. Comparison of integrated absolute error
measures for Groups I, II, and III. (Abscissa: practice
sessions and time segments.)

FIG. 3. Comparison of time-on-target measures for
Groups I, II, and III. (Abscissa: practice sessions.)


In terms of each measure there is some
indication of the superiority of the experimental
procedures over each of the control
procedures. Group II (common sense training)
appears substantially superior to Group
I (no formal training). Group III (experimental
training) is superior to Group II
but the magnitude of this superiority is not
as large as that shown by Group II over
Group I.

Analyses of variance (Edwards, 1951)
were performed to determine the significance
of this apparent effect. Since Group III is
clearly superior to Group I, only Groups II
and III were compared. This analysis (Table
2) compares over-all group differences during
the entire practice period, differences
between practice segments, as well as the
practice segment-group interaction. A significant
interaction would indicate a differential
learning rate or a different form of
learning curve for one group compared with
the other.

These analyses indicate that, for both
measures of performance, the experimental
group is significantly superior to the control
group. For the time-on-target scores (Table
3) the Practice Segment X Groups interaction
term is significant, indicating that one
group has either a differential rate of learning
or a different form of learning curve. This
interaction term is not significant for the
integrated error measure, however.

An additional analysis was conducted to
determine if the superiority of Group III
could be demonstrated using only a measure
of terminal performance. Table 4 presents
the results. The scores used in this analysis
represented the average of the last nine segments
of performance for each subject. Thus,


TABLE 2
ANALYSIS OF VARIANCE BASED ON INTEGRATED ERROR SCORES FOR GROUPS II AND III


Source                          df          SS          MS        F

Group II vs. Group III           1      208,180     208,180     6.25*
Error (Ss in groups)           119                   33,284
Time Segments                   50   14,471,481
Segment X Group Interaction     50       13,701
Error (Ss X segments
  within groups)             5,950    3,537,650

Total                        6,170   22,191,766


* Significant at .05 level.
** Significant at .01 level.




TABLE 3
ANALYSIS OF VARIANCE BASED ON TIME-ON-TARGET SCORES FOR GROUPS II AND III


Source                          df          SS          MS        F

Group II vs. Group III           1       21,255      21,255     4.64*
Error (Ss in groups)           119      544,131       4,573
Time Segments                   50    1,320,240      26,405   546.70**
Segment X Group Interaction     50        5,767         115
Error (Ss X segments
  within groups)             5,950      287,160          48

Total                        6,170    2,179,209


* Significant at .05 level.
** Significant at .01 level.
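
The structure of this analysis can be made explicit by recomputing the F ratios from the sums of squares and degrees of freedom printed in Table 3. This is a sketch of the usual mixed-design logic, assumed here rather than quoted from the source: the between-groups effect is tested against the subjects-within-groups error, while the segment effect and the Segment X Group interaction are tested against the within-subject error. Function and key names are illustrative, and recomputed values differ slightly from the printed ones because of rounding.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    return (ss_effect / df_effect) / (ss_error / df_error)


# (sum of squares, degrees of freedom) as printed in Table 3.
TABLE_3 = {
    "groups": (21_255, 1),
    "subjects_within_groups": (544_131, 119),
    "segments": (1_320_240, 50),
    "segment_x_group": (5_767, 50),
    "within_subject_error": (287_160, 5_950),
}

f_groups = f_ratio(*TABLE_3["groups"], *TABLE_3["subjects_within_groups"])
f_segments = f_ratio(*TABLE_3["segments"], *TABLE_3["within_subject_error"])
f_interaction = f_ratio(*TABLE_3["segment_x_group"], *TABLE_3["within_subject_error"])

print(round(f_groups, 2), round(f_segments, 1), round(f_interaction, 2))
```

The recomputed group and segment ratios fall close to the printed 4.64 and 546.70, which supports reading the two error rows as the denominators for the between-subject and within-subject effects respectively.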


the score of each individual represents 54 
minutes (nine segments times 6 minutes per 
segment) of tracking performance. This ex- 
panded measure was used in an attempt to 
obtain a stable index of terminal perform- 
ance for each subject. In terms of integrated 
error there is a significant indication of the 
superiority of the experimental procedures. 
For the time-on-target measures the superior- 
ity of the experimental procedures approaches 
statistical significance. Inasmuch as the variances
of Group II and Group III were
heterogeneous with respect to both measures,
the t tests were run in accordance with the
Cochran-Cox method as presented by Johnson
(1949, p. 75). This procedure takes into
account the heterogeneity of variances.

From the above results it appears that the 
experimental training procedures operated 
both to improve group performance and to 
decrease group variability. 


TABLE 4
RESULTS OF t TESTS COMPARING GROUP II AND GROUP
III WITH RESPECT TO TERMINAL
PROFICIENCY


                 Mean       SD    Difference      t

Integrated Error (average of last nine segments)

Group II        306.6    229.1       85.5       2.65*
Group III       221.1    104.1

Time-on-Target (average of last nine segments)

Group II        437.3    113.3       33.0       1.77
Group III       470.3     91.0


* Significant at .05 level.
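
For readers unfamiliar with the Cochran-Cox approximation, the sketch below shows the general form of the computation applied to the Table 4 values. It is an illustration under stated assumptions, not a reproduction of the authors' worksheet: the group sizes (61 and 60) are taken from the text, and the two critical t values are approximate two-tailed .05 table entries for 60 and 59 degrees of freedom. The statistic divides the mean difference by the square root of the summed variances of the two means, and the critical value is an average of the two groups' critical values weighted by those same variances.

```python
import math


def cochran_cox_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t', w1, w2), where w_i = s_i**2 / n_i is the variance of each mean."""
    w1 = sd1 ** 2 / n1
    w2 = sd2 ** 2 / n2
    t_prime = (mean1 - mean2) / math.sqrt(w1 + w2)
    return t_prime, w1, w2


def cochran_cox_critical(w1, w2, t_crit1, t_crit2):
    """Weighted critical value: (w1 * t1 + w2 * t2) / (w1 + w2)."""
    return (w1 * t_crit1 + w2 * t_crit2) / (w1 + w2)


# Assumptions: group sizes as read from the text; approximate table values
# of t at the two-tailed .05 level for n - 1 degrees of freedom.
N_GROUP_II, N_GROUP_III = 61, 60
T_CRIT_60_DF, T_CRIT_59_DF = 2.000, 2.001

t_error, w1, w2 = cochran_cox_t(306.6, 229.1, N_GROUP_II, 221.1, 104.1, N_GROUP_III)
crit_error = cochran_cox_critical(w1, w2, T_CRIT_60_DF, T_CRIT_59_DF)

t_target, v1, v2 = cochran_cox_t(437.3, 113.3, N_GROUP_II, 470.3, 91.0, N_GROUP_III)
crit_target = cochran_cox_critical(v1, v2, T_CRIT_60_DF, T_CRIT_59_DF)

print(round(t_error, 2), abs(t_error) > crit_error)
print(round(abs(t_target), 2), abs(t_target) > crit_target)
```

With these inputs the integrated error comparison exceeds the weighted critical value and the time-on-target comparison falls somewhat short of it, in line with the entries and significance marks in Table 4.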


The analysis presented in Table 4 indicates
the experimental training procedures
to be superior in terms of resulting terminal
proficiency. However, this conclusion is based
primarily upon use of over-all performance
(integrated error and time-on-target) measures.
The integrated error measure in turn
is a function of three component scores:
azimuth error, elevation error, and sideslip
error as related in Equation 1. The question
then presents itself: Is the conclusion concerning
the superiority of the experimental
procedures dependent upon a particular component
score? That is, are one or more of
the component scores producing the major
portion of the differences between the groups?

In order to answer this question subjects'
scores on each of the three separate component
measures were examined. For this
purpose, scores during the final nine segments
of practice were summed. Table 5
presents the results of t tests comparing
Groups II and III with respect to each component
score. Inasmuch as the variances of
Group II and Group III were heterogeneous
with respect to each component score, t
tests again were run in accordance with the
Cochran-Cox method as presented by Johnson
(1949).

Table 5 indicates that for two of the
component measures (elevation and sideslip
error) Group III is significantly superior to
Group II. For azimuth error the difference
approaches significance (p = .07). It appears
then that for all component measures the
experimental training procedures operate to




TABLE 5
RESULTS OF t TESTS COMPARING GROUP II AND GROUP
III WITH RESPECT TO COMPONENT PERFORMANCE
MEASURES


                 Mean       SD    Difference      t

Azimuth Error Score (average of last nine segments)

Group II        691.6    591.7      158.8       1.88
Group III       532.8    292.8

Elevation Error Score (average of last nine segments)

Group II        371.2    380.6      178.8       3.43**
Group III       192.4    145.9

Sideslip Error Score (average of last nine segments)

Group II        472.8    266.6      187.5       4.90**
Group III       285.3    134.1


** Significant at .01 level.


improve performance and to decrease variability.
There is no indication that the superiority
of Group III, as indicated by the
integrated absolute error measure, is dependent
mainly upon improvement in one
particular component measure. Each component
task activity contributes to the over-all
performance superiority of Group III.


DISCUSSION 


This study investigated whether certain
kinds of analytical information concerning
task requirements might be used to increase
the efficiency of a training program. The
data of this study support the conclusion
that training procedures based upon analytical
information concerning task requirements
can increase the effectiveness of training.

Ambiguous results were obtained with respect
to the question of a different rate (and
possibly a different form) for the learning
curve of the experimental versus that of the
control group. With analysis of variance procedures
the time-on-target measure indicates
a significantly faster rate of learning for the
experimental group. The analysis based on
the integrated error measure did not confirm
these results. In general, however, it is believed
that the experimental training procedures
used in this study result primarily
in an increase in the slope of the learning
curve and not in its basic character.


It should be noted that no experimental 
procedures were introduced after Session 11. 
Groups II and III were handled in an identi- 
cal manner during the subsequent days rep- 
resented by the terminal proficiency meas- 
ures. In other words, these last trials may 
be regarded as trials during which transfer 
effects of the experimental procedures might 
be evaluated. The results indicate that with- 
drawal of the supplementary training instruc- 
tions did not result in a decrease in per- 
formance for the experimental group. We 
may infer, therefore, that changes in learn- 
ing occurred (with consequent transfer to 
later performances) rather than just changes 
in performance in the presence of these ex- 
perimental procedures during training. 

The fact that the experimental training
procedures resulted in a superiority for
Group III has certain additional implications.
It will be recalled that many of the
experimental training procedures were derived
from information concerning the importance 
of underlying ability factors at different 
stages of mastery of the tracking task. The 
fact that emphasis of these abilities at ap- 
propriate points within the training cycle 
did result in an improvement in training 
would seem to bear certain evidence con- 
cerning abilities identified through factor 
analytic procedures. These abilities may be 
more than just descriptive categories. They 
may, in fact, represent some mediating proc- 
ess within the organism. 

The integrated error measure indicates a
39% improvement in training effectiveness
at the terminal stages of training. Yet the
original analysis of the tracking task upon
which the experimental training procedures
were based was not as fruitful as originally
hoped. Of the 15 ability factors defined by a
factor analysis of the battery of reference
measures, only 2 showed a systematic relation
to tracking performance. Therefore, only
these 2 could be used in structuring the
experimental training program. As further
research into the nature of the abilities
required by tracking tasks or other tasks of
similar complexity is accomplished, it should
be possible to establish training programs
still more effective than used in this study.
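
The source does not say how the 39% figure was computed. A short check against the terminal integrated error means in Table 4 suggests one reading, offered here only as an assumption: the difference between the Group II and Group III means expressed as a percentage of the Group III mean. The variable names are illustrative.

```python
# Terminal integrated error means from Table 4.
group_ii_mean = 306.6
group_iii_mean = 221.1

reduction = group_ii_mean - group_iii_mean

# Two candidate bases for a percentage improvement (which one the authors
# used is an assumption; the first is the one that comes out near 39%).
relative_to_experimental = 100 * reduction / group_iii_mean
relative_to_control = 100 * reduction / group_ii_mean

print(round(relative_to_experimental, 1), round(relative_to_control, 1))
```

Expressed instead as a share of the Group II mean, the same difference is a reduction in error of roughly 28%, so the base of the percentage matters when the figure is compared with other training studies.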




SUMMARY 


The objective of this study was to investi- 
gate the extent to which the effectiveness of 
training for a complex perceptual-motor ac- 
tivity might be increased through use of 
training procedures based upon a detailed 
analysis of task requirements. Specifically, a 
realistic tracking task was analyzed in terms 
of the ability factors which were important 
contributors to proficiency at different stages 
of practice and the relation of component 
task performances to terminal over-all pro- 
ficiency. 

Three groups of subjects were used. The 
three groups were trained on the same track- 
ing task and for identical periods of time. 
Group I (N = 203) received no formal training
other than the answering of questions.
Group II (N = 60) received a “common
sense” training program using standard 
pedagogical techniques. Group III (N = 60), 
the experimental group, received the same 
training as Group II plus a set of standard 
instructions derived from the analysis of task 
requirements, 


The following conclusions are drawn: 


1. There is an indication of consistent su-
periority for the experimental procedures 
throughout the course of practice. These pro- 
cedures operate both to increase proficiency 
and to decrease within-group variability. 

2. There is no evidence that the experi- 
mental program operated differentially upon 
component task performances such as azi- 
muth error, elevation error, or “sideslip” 
error. 

3. Results indicate that the slopes of the 
learning functions differed, but this may not 
mean the basic form of the learning curve 
was different for each group. 

4. The superiority of the experimental
group persisted in later tracking sessions,
when the experimental instructions were no
longer introduced.




REFERENCES 


DuBois, P. H. The design of correlational studies
in training. Paper presented at Symposium on
Training Research and Its Implications for Education,
University of Pittsburgh, 1960.

DuBois, P. H., & Manning, W. H. (Eds.) Methods
of research in technical training. Technical Report
No. 3, May 1957, Washington University, ONR
816(02). (A report of a conference held at the
Naval Air Station, Memphis; comments by Wilse
B. Webb.)

Edwards, A. L. Experimental design in psychological
research. New York: Rinehart, 1951.

Fleishman, E. A. A comparative study of aptitude
patterns in unskilled and skilled psychomotor
performances. J. appl. Psychol., 1957, 41, 263-272.

Fleishman, E. A. Description and prediction of
perceptual-motor skill learning. Paper presented
at Symposium on Training Research and Its
Implications for Education, University of Pittsburgh,
1960.

Fleishman, E. A., & Fruchter, B. Factor structure
and predictability of successive stages of
learning Morse code. J. appl. Psychol., 1960, 44,
97-101.

Fleishman, E. A., & Hempel, W. E. Changes in
factor structure of a complex psychomotor test
as a function of practice. Psychometrika, 1954,
19, 239-252.

Fleishman, E. A., & Hempel, W. E. The relation
between abilities and improvement with practice
in a visual discrimination reaction test. J. exp.
Psychol., 1955, 49, 301-312.

Fruchter, B. Introduction to factor analysis. New
York: Van Nostrand, 1954.

Gagné, R. M., & Bolles, R. C. A review of factors
in learning efficiency. In E. Galanter (Ed.),
Automatic teaching: The state of the art. New York:
Wiley, 1959.

Johnson, P. O. Statistical methods in research.
New York: Prentice-Hall, 1949.

McGehee, W. Are we using what we know about
training? Learning theory and training. Personnel
Psychol., 1958, 11, 1-12.

Parker, J. F., & Fleishman, E. A. Prediction of
advanced levels of proficiency in a complex tracking
task. USAF WADC tech. Rep., 1959, No. 59-255.

Parker, J. F., & Fleishman, E. A. Ability factors
and component performance measures as predictors
of complex tracking behavior. Psychol.
Monogr., 1960, 74(16, Whole No. 503).


(Received September 2, 1960) 




Journal of Applied Psychology 
1961, Vol. 45, No. 5, 303-308 


OPERATOR PERFORMANCE ON A CHORD KEYBOARD 


H. C. RATZ

University of Saskatchewan

AND

D. K. RITCHIE ¹

Ferranti-Packard Limited, Toronto


The man-machine link has received considerable
attention both in simple feedback
systems (Hick, 1952) and as a data processing
unit. In a computer system, the human operator
may be employed because of his ability
to translate between plain language and machine
language, while in a control system his
ability to recognize complex patterns in performance
data is exploited (Elkind & Forgie,
1959).

The performance of an operator in choice
and sorting problems has been treated by
several investigators (Crossman, 1953). They
have found the “choice reaction time" to be
proportional to the stimulus complexity as
measured by the logarithm of the number of
alternatives. Experimental variations have included
frequency unbalance among the alternative
stimuli (Hyman, 1953), and the addition
of extraneous information not relevant to
the required choice (Archer, 1954; Gregg,
1954). Fitts (1953) has pointed out that an
operator’s data processing capability is subject
to the fixed constraints of his motor system.
These constraints are implicit in the form
of the response code and determine his effective
capacity. For example, a telegraph operator
produces time sequences with one hand,
while the teletypist uses 10 fingers one at a
time. The fact that motor constraints have a
considerable influence in the latter case is
known from the improvements that can be
achieved through keyboard and language [...]
(Dvorak, Merrick, Dealey, & Ford, 1936).
The type of keyboard studied in this paper is
more akin to that used by the stenotypist
who employs several fingers simultaneously.

At present, the typewriter remains the most
effective man-computer communication link
(Licklider, 1960) although other aspects of
computer systems such as displays and programing
language have advanced considerably.
The multiple finger or “chord” keyboard
may be a useful alternative in some special
purpose data processing systems.

¹ The authors acknowledge the assistance of [...] in
the [...] of these experiments and in the [...] of
the data. The valuable comments on the manuscript
by George A. McMurray are appreciated.

The relative difficulty of the various chords 
can be measured in terms of the reaction time 
in responding on the keyboard to a visual 
presentation of the chord pattern. Since lights, 
fingers, and keys are in direct correspondence, 
the stimulus-response codes are highly com- 
patible; that is, mental recoding of the in- 
formation is avoided. Thus we obtain an 
experimental assessment of the performance 
of an operator using this chosen Set of re- 
sponse motor tasks, Using these data and the 
principles of optimum coding, the more fre- 
quently used message units can be assigned to 
the easier response tasks. This optimum dis- 
tribution of message units will minimize the 
average time per message and maximize the 
rate of information processing (Blachman, 
1954). 
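
The assignment principle described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: message units are ranked by frequency of use, chords are ranked by measured reaction time, and the two rankings are paired so that the most frequent message receives the fastest chord. The message labels, chord labels, frequencies, and reaction times below are all hypothetical.

```python
def assign_messages_to_chords(message_freq, chord_time):
    """Pair the most frequent messages with the fastest chords."""
    if len(message_freq) > len(chord_time):
        raise ValueError("not enough chords for the message set")
    messages = sorted(message_freq, key=message_freq.get, reverse=True)
    chords = sorted(chord_time, key=chord_time.get)
    return dict(zip(messages, chords))


def mean_time_per_message(assignment, message_freq, chord_time):
    """Frequency-weighted average response time for a given assignment."""
    total = sum(message_freq.values())
    weighted = sum(message_freq[m] * chord_time[assignment[m]] for m in assignment)
    return weighted / total


# Hypothetical relative frequencies and chord reaction times in seconds.
message_freq = {"E": 0.5, "T": 0.3, "Q": 0.2}
chord_time = {"index": 0.40, "index+middle": 0.55, "ring+little": 0.70}

optimum = assign_messages_to_chords(message_freq, chord_time)
worst = {"E": "ring+little", "T": "index+middle", "Q": "index"}

print(optimum)
print(mean_time_per_message(optimum, message_freq, chord_time))
print(mean_time_per_message(worst, message_freq, chord_time))
```

Pairing the two rankings in this way gives the smallest frequency-weighted mean time of any one-to-one assignment, which is why the reversed pairing in the example comes out slower.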


METHOD
Apparatus 


A block diagram of the experimental apparatus is 
shown in Figure 1. A paper tape reader provided se- 
quences of patterns for the lamp display. The opera- 
tor responded to each pattern in turn by depressing 
the corresponding keys of the keyboard. A complete 
temporal record of both the display and response ac- 
tivity was obtained from a 20-pen recorder, Ten of 
the pens recorded the motions of the keyboard re- 
sponses, while the other 10 recorded the timing of 
the lights. Thus reaction time, errors, latency, etc.,
could all be obtained from the paper strip record.

The keyboard contained 10 keys mounted on a
horizontal surface and placed at the positions
occupied by the fingertips with the fingers slightly
curled and the wrists straight. The keys operated
snap action switches through a lever which required
about three-eighths inch travel at the key, and quite
low pressure. These switches drove the recording pens
and, when all keys were released, signaled the tape
reader to present the next stimulus pattern. The lights
used in the stimulus display were placed in two
groups of five on a horizontal line above and
beyond the keyboard. They were mounted on a
surface tilted so as to be roughly perpendicular to the
line of sight.




Fig. 1. Block diagram of the experimental apparatus
(paper tape, automatic control equipment, stimulus
display lamps, response keyboard, and subject). (The
flow of stimulus and response information is
indicated. Control signals are omitted.)


In a typical experiment, the operator depressed the 
keys on the keyboard which corresponded to the 
combination of illuminated lights presented to him. 
That is, the operator transcribed light “chords” into 
key “chords.” When all keys were released by the 
operator, a new combination of lights would appear 
whether or not the correct chord had been struck. 

The function of the control apparatus was to auto- 
matically step on the tape reader and to transfer the 
combinations punched on the tape to the lamps. The 
method of operation as described above is one in 
which the operator paces himself; that is, the new 
stimulus was automatically presented after the keys 
were released from the previous response. Although
not used in these experiments, forced pacing at vari- 
ous speeds, and with a choice of ratio of “on time” 
(stimulus presented) to “off time” (dead time
between stimuli), could be programed with the same
control apparatus.


Stimuli 


The tape reader was a standard type used with 
computers and employed five-hole tape. For experi- 
ments using chords in only one hand, successive char- 
acters on the tape provided the stimulus combina- 
tions. By using a pair of characters from the tape 
for each presentation, a full 10-light pattern was 
achieved. This choice of program could be preset at 
the control apparatus. Thus when both hands were 
used the reader passed two characters of the tape at 
a time. 

Since the sequence of light patterns depended only 
on the characters of the punched tape, many experi- 
ments with various sets of chords could be easily 
programed. For example, one tape employed all the
31 different possible combinations with equal frequency
but in random order; another employed only
those 15 different chords per hand which are possible
using either one or two fingers.
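
The chord counts quoted above can be checked by direct enumeration. The following is a minimal sketch, not part of the original apparatus description; the function names `chords` and `pattern` are illustrative assumptions, and a chord is represented simply as a tuple of finger indices 0 to 4.

```python
from itertools import combinations

FINGERS = range(5)  # five keys under one hand (indexing is an assumption)

def chords(max_fingers):
    """All nonempty key combinations using at most max_fingers fingers."""
    return [c for n in range(1, max_fingers + 1)
            for c in combinations(FINGERS, n)]

def pattern(chord):
    """Render a chord as a five-character string, 'O' marking a pressed key."""
    return "".join("O" if f in chord else "-" for f in FINGERS)

all_chords = chords(5)        # every possible one-hand chord
one_or_two = chords(2)        # chords using one or two fingers only
two_hand_patterns = len(all_chords) ** 2  # one chord per hand, both hands used

print(len(all_chords), len(one_or_two), two_hand_patterns)
print(pattern((0, 3)))
```

The same enumeration gives 25 chords when up to three fingers are allowed, and 961 two-hand patterns when each hand plays one of its 31 chords.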


Subjects 


Six operators were used in the experiments. Three 
of these used both hands throughout, while the other 
three used only the hand of preference, which was 
the right hand. Each operator practiced 10 minutes 
a day on random sequences of the 31 chords. The
performance was completely recorded for the first
and last 100 seconds of each 10-minute interval.

Although the task required in these experiments is
a simple motor reaction, there is an initial short
learning period. By checking the average reaction
time of a few of the chords, it was observed that
little improvement took place after the second day.
Asymptotic behavior was reached rather quickly
because the order of the presentations was random. The
absence of any systematic intersymbol influence
between successive chords precluded any longer term
improvement through learned predictability since, in
this case, each of the possible stimuli was equally
likely to occur at any point in the sequence. In other
terms, the stimulus signals contained no redundancy,
so the gross data rate and the net rate of informational
entropy were equivalent. Finally, because self
pacing was used, the number of erroneous responses
made was negligible.


Procedure 


In order to rank the chords according to increasing
reaction times, a stimulus tape using all 31 different
chords at random was used to generate the patterns.
The daily test periods and recording procedure
were the same as during training. From the
2,660 measured reaction times, each chord was
assigned a rank number according to its position in the
list of increasing reaction times. The reaction time
was measured between the appearance of the
stimulus pattern and the complete selection of the
corresponding chord. The data for each chord were
totaled for the complete duration of the period,
since an operator’s behavior might vary from
day to day.

In the second part of the experiment, information
rates were measured using selected groups of stimulus
chords. Four groups were defined which used a
maximum of one, two, three, or all five fingers per
hand. They appeared in random sequence with equal
frequency. The appropriate information measure is
the logarithm of the number of possible choices,
which is the average entropy per pattern of the
source, i.e.:

$$H = \log_2(\text{number of choices})$$

The information rate (bits per second) is obtained
from the product of H and the average number of
responses per second. The latter was measured by
counting the total number of responses in 5-minute
tests for the cases involving only one hand, and

Fig. 2. Chord rank chart. (The fingers used in
the right hand are indicated.)


S-minute tests for the cases with two hands. The six 
different experiments are listed in Table 1. 
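
As an illustration of the information measure described in this section, the sketch below computes H for the six chord groups and turns a response count into a rate in bits per second. It is only a sketch: the function names are assumptions, and the count of 250 responses in a 300-second test is a hypothetical figure, not a value from the experiments.

```python
import math

def entropy_bits(n_choices):
    """Average entropy per pattern for equally likely choices (bits)."""
    return math.log2(n_choices)

def information_rate(n_choices, n_responses, duration_s):
    """Bits per second: H times the average number of responses per second."""
    return entropy_bits(n_choices) * n_responses / duration_s

# Number of equally likely patterns in each group of stimulus chords.
groups = {
    "one hand, 1 finger": 5,
    "one hand, up to 2 fingers": 15,
    "one hand, up to 3 fingers": 25,
    "one hand, all chords": 31,
    "two hands, 1 finger per hand": 5 * 5,
    "two hands, all chords": 31 * 31,
}
for name, n in groups.items():
    print(f"{name}: H = {entropy_bits(n):.2f} bits/stimulus")

# Hypothetical count: 250 responses in a 300-second test with all 31 chords.
example_rate = information_rate(31, 250, 300.0)
print(f"{example_rate:.2f} bits/sec")
```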


RESULTS AND DISCUSSION 
Chord Ranking 


The average ranking of chords by increasing
reaction times is shown in Figure 2, where
the fingers used in each chord are indicated
for the right hand. The average correlation
coefficient among operators for all 31 chords
is [...]; hence, the agreement on the overall
ranking is excellent. However, there is no
significant agreement among operators concerning
the rank of any chord relative to those
immediately adjacent on the list. For example,
if we confine our attention to the first five
on the list, the correlation in the ranking by
different operators is not significant; but if
[...] or more chords are taken the agreement
is good. This is clear from the maxima
and minima for the different subjects in Figure 3.

The main features of the chord rank chart
are not unexpected; thus, the top five on the
list involve only one finger each. The chords
requiring the longest response times all contain
patterns requiring one or two fingers to
be held off while their neighbors on each side
are used. In between (ranks 8 to 15) lie those
chords consisting of a consecutive group of
fingers.


Distribution of Response Times 


If each chord is assigned a “cost” propor- 
tional to its associated reaction time, then the 
distribution of Figure 3 is obtained. These re- 
sponse times include a 0.1-second delay in the 
presentation of the stimulus pattern. In pass- 
ing, it was observed that about 65% of the 
total time required to complete the chord re- 
sponse could be assigned to latency, where 
latency is taken as the time between stimu- 
lus presentation and the first indications of 
response. Figures 2 and 3 give the relative 
ranking of chords by response time and show 
quantitatively how the cost in reaction time is 
distributed over the different chords. 

The quantitative difference in performance 
using various subgroups of chords can be pre- 
dicted from Figure 3. If it is assumed that the 
cost distribution of Figure 3 represents the 
motor response times of each chord and that 
the effect of choice when there are different 
numbers of possible alternatives is negligible, 
then the expected response time would be the 
average cost (in seconds of reaction time) de- 
termined from the distribution for the given 
subgroup of stimulus patterns. The “predicted 
response time” of Table 1 is the average of 
the costs of those chords used in a particular 
group of stimulus patterns plus the 0.1-second 
dead time between presentations. To extend 


Fig. 3. Distribution of reaction times (relative
reaction time plotted against chord rank). (Chords are
ranked according to Figure 2 and reaction times are
expressed relative to the median of 1.16 seconds.)




TABLE 1

COMPARISON OF THE OBSERVED DATA RATE FOR SUBGROUPS OF CHORDS WITH THAT PREDICTED BY THE CHORD
RANKING EXPERIMENT

                                                     Observed                       Predicted
                                      H              Response Time T   Observed     Response Time C   Predicted
            Stimulus Patterns         (bits/         (seconds/         Data Rate    (seconds/         Data Rate
Experiment  (Chords)                  stimulus)      response)         (bits/sec.)  response)         (bits/sec.)

One Hand
  A         1-finger chords           2.32            .94              2.4          1.05              [...]
  B         1-, 2-finger chords       3.91           1.07              3.7          1.13              [...]
  C         1-, 2-, 3-finger chords   4.64           1.15              [...]        1.18              [...]
  D         All chords                4.95           1.20              4.1          1.20              [...]
Two Hands
  E         1 finger per hand         4.64           2.08              2.3          (2.1)             (2.2)
  F         All chords                9.91           2.63              3.8          (2.4)             (4.1)


this to two hands, the distribution for all 
961 allowable 10-finger patterns would be re- 
quired. Lacking these data, the figures entered 
in parentheses in the table are merely double
the expected reaction times for one hand. 
Table 1 shows the results of experiments 
using only selected groups of chords and 
measuring the average response time for the 
group in each case. The “Observed Data 
Rate" is obtained by dividing the H for the 
group by the observed average response time 
T. Since H measures the stimulus information 
in binary units, the result is an information 
flow rate expressed in bits per second which 
neglects the possibility of errors. As is usual 
in this type of human operator experiment,
the higher information rates correspond to the 
stimuli giving the greater choice and having 
higher values of H. While this conclusion is 
valid here for the cases involving one-handed
operation, note that it does not necessarily
extend to the use of two hands. Comparing
Experiment A with E, and D with F, it is seen
that an operator using only one hand may
perform at somewhat better than twice the
speed of one using both, and therefore has
a greater resulting information handling
capacity. Thus the increase in choice derived
from using both hands on 10 keys was more 


than offset by the slower response of the op- 
erator. 
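
The data-rate arithmetic described above (H divided by the average response time) can be reproduced in a few lines. This is a sketch under stated assumptions: the response times are read from Table 1 for the one-hand experiments A to D and for Experiment F, the entry for Experiment A is taken to be 0.94 second, and the variable names are illustrative. The rates are recomputed here and are not copied from the table.

```python
import math

# (experiment, equally likely patterns, observed T, predicted C), in seconds.
# Values as read from Table 1; the 0.94 for Experiment A is an assumed reading.
ONE_HAND = [
    ("A", 5, 0.94, 1.05),
    ("B", 15, 1.07, 1.13),
    ("C", 25, 1.15, 1.18),
    ("D", 31, 1.20, 1.20),
]

def data_rate(n_patterns, seconds_per_response):
    """Information flow in bits per second, neglecting errors."""
    return math.log2(n_patterns) / seconds_per_response

observed = {name: data_rate(n, t) for name, n, t, _ in ONE_HAND}
predicted = {name: data_rate(n, c) for name, n, _, c in ONE_HAND}

# Experiment F: all chords on both hands, 31 * 31 patterns, T = 2.63 seconds.
two_hand_all = data_rate(31 * 31, 2.63)

for name in observed:
    print(f"{name}: observed {observed[name]:.2f}, predicted {predicted[name]:.2f} bits/sec")
print(f"F: observed {two_hand_all:.2f} bits/sec")
```

With these inputs the one-hand rate rises with H, while the two-hand rate for all chords falls below the one-hand rate for all chords, which is the comparison drawn in the text.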


Information Rates 


The agreement between the observed response
time T and the expected average cost
C indicates that the reaction time for a given
subgroup of chords is very nearly the average
of the reaction times for its individual members.
Thus the assumption made in computing
C is substantially supported, namely, that
the reaction times are highly characteristic of
the individual chords and relatively independent
of the number and remaining members in
the set. However, there is a tendency for the
response times to be shorter (and the information
rate higher) than that predicted when
the amount of choice in the set becomes
smaller; that is, the effect of the amount of
choice on the overall response time may be
present but is secondary to the average chord
reaction time. Since the basis of all predictions
was reaction time measurements using
all 31 chords, the agreement in Experiment D
is an arithmetic check.

In Table 1, the highest data rate is obtained
using all chords on one hand. However,
this assumes that the 31 different stimulus
patterns are used with equal frequency, a
method which is not optimum in view of the
nonuniform cost distribution function. Obviously,
the more difficult chords should be used
less often than the easier ones. The frequency
distribution required for optimum performance
can be determined from coding theory.


Application of Coding Theory 


The problem of communication with unequally
weighted vocabularies has been examined
as a game of strategy by Mandelbrot
(1954). By interpreting the average response
time to each of the chords as its cost, a code
can be specified which will transmit a given
amount of information in minimum time. Under
the assumption that the response time to
a chord is independent of its predecessors or
the amount of choice, the encoding which
minimizes the cost can be essentially specified
by the relative frequencies of use of the chords
(Blachman, 1954). Since Table 1 shows this
assumption to be approximately true, the
maximum information rate will be achieved
by assigning the easier chords to the more
frequent messages.

Suppose the chords are given a rank R = 1,
2, ..., 31 as shown in Figure 2 with an associated
cost ($C_R$) from the experimental results
of Figure 3. The problem is to find the
relative frequency of use for each chord, its
probability $p_R$, that will maximize the net
information rate H/C. Here H is the weighted
average entropy per chord:

$$H = -\sum_R p_R \log_2 p_R$$

and C the average cost:

$$C = \sum_R p_R C_R$$

The probabilities are normalized so that:

$$\sum_R p_R = 1$$

The solution of this variational problem is
in terms of the “partition functions” of statistical
mechanics (Jaynes, 1959); the distribution
is of the form:

$$p_R = 2^{-k C_R}$$

where k, the maximum information rate, is
the solution of the equation:

$$\sum_R 2^{-k C_R} = 1$$
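
The condition that the quantities 2^(-k C_R) sum to 1 over all chords has no closed-form solution for k, but it is easy to solve numerically. The sketch below uses bisection. The cost values are hypothetical (31 costs rising linearly from 0.95 to 1.45 seconds), because the measured costs of Figure 3 are not tabulated in the text; the function name `solve_k` is likewise an assumption.

```python
import math

def solve_k(costs, tol=1e-12):
    """Bisection for k > 0 satisfying sum(2**(-k*c)) == 1.

    Assumes at least two chords and strictly positive costs (seconds).
    """
    def excess(k):
        return sum(2.0 ** (-k * c) for c in costs) - 1.0

    lo, hi = 0.0, 1.0
    while excess(hi) > 0.0:      # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:    # sum still too large: k must increase
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical cost function: 31 chords, 0.95 s (easiest) to 1.45 s (hardest).
costs = [0.95 + 0.5 * r / 30 for r in range(31)]

k = solve_k(costs)
p = [2.0 ** (-k * c) for c in costs]              # optimum relative frequencies
H = -sum(q * math.log2(q) for q in p)             # average entropy per chord
C = sum(q * c for q, c in zip(p, costs))          # average cost per chord
uniform_rate = math.log2(len(costs)) / (sum(costs) / len(costs))

print(f"optimum rate k = {k:.3f} bits/sec (H/C = {H / C:.3f})")
print(f"uniform-frequency rate = {uniform_rate:.3f} bits/sec")
print(f"most/least used frequency ratio = {p[0] / p[-1]:.2f}")
```

Because every probability has the form 2^(-k C_R), the entropy equals k times the average cost, so H/C reproduces k; the uniform-frequency rate can never exceed it.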


Fig. 4. Comparison of uniform with optimum
frequency distribution (relative frequency plotted
against chord rank).




Using Figure 3 as the description of $C_R$, the
information rate obtained with optimum
frequency distribution of the chords is:

$$k = 4.33 \text{ bits/second}$$

The resulting distribution is compared in Figure 4
with the uniform frequency method of
Experiment D. Note that the frequency of
occurrence of the most used chord (R = 1)
is about 3.7 times that of the least used
(R = 31), with the unbalance in favor of the
easier chords. The improvement to be expected
over the results of Experiment D is of
the order of 5% in the net information rate,
if the frequency of use is optimum as shown
in Figure 4.

SUMMARY


The 31 chords have been ranked according 
to their difficulty or “cost” as measured by 
the reaction times, and a quantitative meas- 
ure obtained for the distribution. Assuming 
that the motor system predominates over 
choice reaction time, these results are used to 
predict the expected average response times 
and information rates using subgroups of the 
chords that involve less choice; for example, 
a group may include only a few of the easier 
chords with the shorter response times. The 
agreement with experiment indicates that the 
results were determined primarily by the re- 
action times of the particular chords in the 
group, and that the effect of the amount of 
choice was secondary. On this basis, coding 
theory can be used to deduce the distribution 
of the frequency of occurrence of chords with 
the measured cost function that gives the 
maximum information rate through the man- 
machine link. 

For one hand the information rate increases 
with the complexity of choice of patterns, but 
this does not remain valid for two hands 
where the loss in response speed overbalances 
the increased choice of stimuli. It should be 
noted that these keyboard experiments were 
conducted using a highly compatible display, 
and that other forms of presentation might 
introduce effects of stimulus-response coding 
that would reduce the importance of the mo- 
tor reaction time. With this qualification, the 
results show that motor system constraints are 
predominant over choice reaction time in de- 
termining speed on a chord keyboard. 




REFERENCES 

Archer, E. J. Identification of visual patterns as a
function of information load. J. exp. Psychol.,
1954, 48, 313-317.

Blachman, N. M. Minimum-cost encoding of information.
IRE Trans., 1954, IT-3, 139-149.

Crossman, E. R. F. W. Entropy and choice time:
The effect of frequency unbalance on choice response.
Quart. J. exp. Psychol., 1953, 5, 41-51.

Dvorak, A., Merrick, N. L., Dealey, W. L., & Ford,
G. C. Typewriting behavior. New York: American
Book, 1936.

Elkind, J. I., & Forgie, C. D. Characteristics of the
human operator in simple manual control systems.
IRE Trans., 1959, AC-4, 44-55.

Fitts, P. M. The influence of response coding on
performance in motor tasks. In, Current trends in
information theory. Pittsburgh: Univer. Pittsburgh
Press, 1953. Pp. 47-75.

Gregg, L. W. The effect of stimulus complexity on
discrimination responses. J. exp. Psychol., 1954,
48, 289-297.

Hick, W. E. On the rate of gain of information.
Quart. J. exp. Psychol., 1952, 4, 11-26.

Hyman, R. Stimulus information as a determinant
of reaction time. J. exp. Psychol., 1953, 45,
188-196.

Jaynes, E. T. A note on unique decipherability. IRE
Trans., 1959, IT-5, 98-102.

Licklider, J. C. R. Man-computer symbiosis. IRE
Trans., 1960, HFE-1, 4-11.

Mandelbrot, B. Simple games of strategy occurring
in communication through natural languages. IRE
Trans., 1954, IT-3, 124-137.


(Received September 14, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 5, 309-317 


NOISE, THE "AROUSAL HYPOTHESIS,"
AND MONOTONOUS WORK¹


WILLIAM N. McBAIN 


San Jose State College 


What is the nature of the relation between 
noise and performance? Over the past 45 
years there has been a discouraging lack of
agreement among the proposed answers to
this question. Berrien (1946) concluded that
findings were at best "contradictory," while
Melton and Briggs (1960) characterized results
from the previous 2 years’ research as
"suggestive, but not easily integrated." Indeed,
we have advanced little toward a general
answer since Morgan (1916) first raised
the question.

The present study was undertaken in an
attempt to clarify the problem by placing it
in the larger framework of a theory dealing
with the effects of environmental stimulation,
generally. It combines the recent trend toward
specifying the kind of work and of
noise involved with a type of dimensional
analysis of the particular classes of work
and noise used. Though the experimentation
is guided by these theoretical considerations,
the focus is upon the practical problem of
reducing errors in industrial output and of
selecting personnel least liable to such errors.


Noise and Arousal


An attribute possessed by noise, in common
with other stimuli, is variability. Very
rarely, except in a laboratory situation, is an
organism exposed to a constant, [...].
Change in stimulation is the rule.
It has been proposed (Hebb, 1955) that
changes in stimulation not only serve as
cues to the organism, but also to activate
or arouse areas of the cortex which are involved
in the effective use of (or reaction
to) such environmental cues. Marked impairments
in coping behavior have been found
(Bexton, Heron, & Scott, 1954; Scott, Bexton,
Heron, & Doane, 1959) when environmental
stimulation has been severely restricted.

¹ The research reported in this paper was supported
by the Defence Research Board of Canada,
Grant No. 9401-12, Project D77-94-01-12.
The author, a faculty member of McGill University
at the time, wishes to acknowledge the assistance
of [...] Lyall, who not only aided in the planning
and analysis, but also performed most of the
experimental work on a very strenuous schedule.
Thanks are due to F. S. Carpenter, Air Officer
Commanding, Transport Command, Royal Canadian
Air Force, who made available the experimental
facilities, and to members of his staff.

The theoretical relation between effective- 
ness and the degree of arousal is not a linear 
one, but may be represented as an inverted 
U when the former variable is plotted on the 
vertical axis and the latter on the horizontal. 
For a specific person, performing a particu- 
lar job, a given degree of arousal should 
lead to optimal performance, while lower or 
higher arousal levels will be associated with 
reduced effectiveness. Since all sensory inputs 
are routed to the nonspecific arousal System, 
any environmental change, including changes 
in auditory stimulation, should result in in- 
creased arousal. 

While the degree of arousal is chiefly a 
function of, and dependent upon, the total 
variability of the stimulus situation, there
seems also to be a feedback situation which
allows higher mental processes such as problem
solving or imagining to contribute.
Though these internal sources of arousal are
apparently insufficient to maintain optimum 
effectiveness, our knowledge of them requires 
that they be controlled in experimentation. 


Noise and “Intelligibility” 


There is a further characteristic of noise 
which, though it has little to do with either 
variability or intensity, seems to have a great 
deal to do with the extent to which a noise
will prove distracting. "Distracting" is used
here in the sense of successful competition 
for attention and is demonstrated when loud 
shouts in the hall spoil the rapport between 




students and lecturer. For want of a better 
name this characteristic is referred to as “in- 
telligibility," which implies that it is a joint 
function of the stimulus and the person 
stimulated. Roughly, a noise may be consid- 
ered intelligible to the extent that it carries 
meaning within the frame of reference of the 
individual, rather than being perceived simply 
as a noise. While languages may exhibit this 
characteristic most clearly, the squeal of a 
dry bearing or the ring of a coin falling on 
pavement possess a high degree of intelligi- 
bility for appropriate auditors. 

In experimental situations one can control 
the intelligibility of the noise used by holding 
it at approximately zero (pure tones, white 
noise, etc.), or by keeping it at a low level 
which can be presumed to be approximately 
equal for all subjects. Music used in work 
situations is ordinarily restricted to instru- 
mental, and would be considered an example 
of the second approach. To the extent that 
sound is intelligible it can be presumed to be 
distracting, and hence likely to produce errors
in the performance of a task demanding continuous
attention.


Monotonous Work 


When applied to a work situation the two 
terms “boring” and “monotonous” have fre- 
quently been used as synonyms. In this study 
the term boring will have the usual reference 
to certain unpleasant subjective states often 
associated with the performance of jobs in 
which one has no particular interest. Monoto- 
nous, on the other hand, will be reserved to 
refer to a total work situation which offers 
little variability, and hence little opportunity 
for arousal. Monotonous, when used in this 
way, is also a characteristic of the stimulus 
situation and the individual. The work situation
is monotonous to the extent that it lacks
variability, which is a relatively objective and
(at least conceptually) quantifiable characteristic.
Since it is not possible to quantify
the relatively less important sources of variability
that come from the individual’s imaginative
and cognitive processes, an experimental
task should be such as to hold these
at the lowest possible level.


To be specific, in terms of the foregoing
considerations a task might be said to be
monotonous to the extent that it (a) involves
in itself and the total work situation as little
variability or stimulus change as possible
(variability should be repetitious to the extent
that it does occur); (b) demands the individual's
attention as continuously as possible,
so that imagination and fantasy systems are
unlikely to develop; (c) is sufficiently structured
and planned that a minimum of cognitive
activity is required in its performance.

The laboratory task developed to meet
these criteria required each subject to hand
print continuously, at a paced rate, a sequence
of seven pairs of letters. In comparison
to most procedures used to test
“perceptual vigilance” (Broadbent, 1953),
this task has the advantage of having the
subject continuously and actively involved,
so that lapses of alertness can be traced to
the exact time of occurrence. It also resembles
more closely the machine paced conditions
of performance which characterize many
production line jobs.


Arousal and Hypnotic Susceptibility 


Long distance truck driving, especially
at night, is a job which comes close to meeting
the criteria of monotony given above.
The Harvard School of Public Health study
of Human Factors in Highway Transport
Safety (McFarland & Moseley, 1954, pp.
124-125) points out that hallucinations are
a recognized hazard of this job. The isolation
studies (Bexton et al., 1954) showed that
reducing stimulus input frequently leads to
various types of hallucinations. Arm[...]
inspectors have been reported as “falling
asleep” at this highly repetitious visual task,
frequently relatively early in a shift.² The
resemblance of these occurrences to phenomena
which are common in the hypnotic
state suggested a relation between the individual's
hypnotic susceptibility and his reaction
to a low variability stimulus situation.
Though this suspected relation is
tangential to the effects of noise, its investigation
was thought likely to add to our
understanding of behavior in a monotonous
work situation.

² E. C. Webster, personal communication, 1956.


Experimental Hypotheses 


The major experimental hypothesis is implicit
in the arousal hypothesis, which predicts
lowered efficiency following the reduction
of environmental stimulation to low
levels. In a monotonous work situation
(defined in terms of a very low level of
stimulus change) the introduction of noise,
by increasing variability and hence arousal,
should lead to improved performance.³ As
previously noted, the noise introduced should
be low in intelligibility, otherwise its distracting
effect may mask any performance improvement
attributable to increased arousal.

A second hypothesis asserted that a relation
would be found between a standard
measure of susceptibility to hypnosis and the
amount of performance decrement in the task
occurring from conditions of noise to conditions
of minimal variability.

An additional purpose of the investigation
was to assess the laboratory task itself as a
reliable index of interindividual differences
in performance under monotonous work conditions.
If demonstrated to be reliable, the
resulting increase in behavioral sampling
should lead to improved validity of selection
for such jobs (McBain, 1957).


METHOD 


Subjects. Thirty male members of the Royal
Canadian Air Force formed the original group.
Their ages ranged from 19 to 36 years with a
mean of 25.8. All, except two of the three officers
in the sample, were nonflying personnel.
Unscorable records, mechanical failure, and
equating of groups reduced the N to 26 or less for
the data presented.
Procedure.⁴ Each subject completed four 42-minute
sessions of work at a monotonous task.
For all subjects these were spaced at [...]
week intervals. Prior to these sessions, each
subject was administered a slightly modified form of
the Friedlander and Sarbin (1938) Scale of Hypnotic
Depth, essentially an instrument which yields
a score reflecting the individual's reaction or
susceptibility to a measured "dose" of hypnotic [...].

³ [...] (1954) unpublished suggestion to this
effect was one of the bases for the present study.
[...] (1955) published a similar suggestion [...].

⁴ The complete [...] protocol is [...] the
terminal report submitted to the Defence
Research Board of Canada.


The Monotonous Work Task. A laboratory task 
was devised to fit the criteria of a monotonous 
job, as defined above. This consisted of hand print- 
ing continuously in sequence the seven pairs of 
letters: KT, LH, IM, XF, KV, TZ, IW. Each of 
the pairs consisted of letters which were easily 
discriminable for scoring purposes and in ordinary 
printing each required the same number of strokes.
Prior testing demonstrated the sequence to be 
highly confusing. This, combined with pacing, helped 
to meet the task criterion of being attention de- 
manding. To reduce even further the cognitive 
aspects of this highly structured task, the list of 
letter pairs was posted so that the subject could 
consult it by briefly looking away from the print- 
ing surface. 

The requirement of minimum variability in the 
total work situation was met by several means. 
Intruding exterior noises were masked to a con- 
siderable extent by the continuous operation of a 
powerful exhaust fan during sessions, so that the 
only variable sounds in the experimental room were 
the rhythmic clicks of the timing and recording 
system. Visual variability was reduced by having 
the performance carried out in a black draped 
cubicle which was visually isolated from the rest 
of the experimental room. Kinesthetic variability 
was controlled to some extent by cautioning the
subject to move about as little as possible.

At the beginning of the first work session each 
subject was given a pretest which required rapid 
copying of the letter pairs on a prepared test blank.
Three 36-second trials were given. The subject
was encouraged to print as quickly as possible.
For the first three of the succeeding 42-minute
sessions each subject was required to respond at
three-fifths of his average rate on the two fastest 
pretest trials. On the fourth session each subject 
was required to work at a slower rate. 
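
The pacing rule just described is simple arithmetic, sketched below for concreteness. The pretest counts are hypothetical, the function name `paced_rate` is an assumption, and the rate is taken here to mean letter pairs per second over each 36-second trial. The one-half fraction is the slower rate reported for the final session.

```python
def paced_rate(pretest_counts, trial_seconds=36.0, fraction=3 / 5):
    """Required responses per second.

    A fixed fraction of the mean rate achieved on the two fastest
    pretest trials (counts are letter pairs copied per trial).
    """
    fastest_two = sorted(pretest_counts, reverse=True)[:2]
    mean_rate = sum(fastest_two) / len(fastest_two) / trial_seconds
    return fraction * mean_rate

counts = [30, 36, 33]                          # hypothetical pretest results
rate = paced_rate(counts)                      # first three sessions
slow_rate = paced_rate(counts, fraction=0.5)   # final session

print(f"{rate:.3f} pairs/sec, one pair every {1 / rate:.2f} sec")
print(f"about {rate * 42 * 60:.0f} responses required in a 42-minute session")
print(f"final session: {slow_rate:.3f} pairs/sec")
```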

The pacing apparatus Presented to the subject 
a 7/8" XI opening beneath which ran a con- 
tinuous tape of white paper. This advanced inter- 
mittently at the rate set for each subject, and in 
such a manner that the previously printed pair was
immediately concealed. Each time the subject made
an error (defined as “anything in the space except
the next pair of letters in the sequence, printed
legibly and completely”) he was required to signal
his awareness of this by means of a foot pedal.
If the subject lost his place in the sequence he was
required to begin again at the first pair, and was
informed that this counted as an error. Subjects
were instructed to make as few errors as possible. 

Conditions. Three basic conditions were used in 
the four sessions. Under Quiet (Q) conditions the 
only sounds in the experimental room were the 
rhythmic clicks of the apparatus and the masking 
sound of the fan. During the entire period of the 
Noise (N) sessions a magnetic tape was played. 
Its content was considered to be low in intelligibility,
since it was basically a recording of speech
played in reverse, specifically, part of a
labor-management conference. To add to its variability
there were superimposed short fragments of
intelligible conversation, music, meaningless sounds,
etc., and the loudness was varied unsystematically
from short periods of complete silence and near
inaudibility to a degree which approached the dis- 
comfort level. Each subject was exposed to the 
same portion of the tape recording at the same 
intensity. After being told that recorded noise 
would be used, the subject was instructed: 


Try not to let it affect your work at all, but 
just work straight ahead as you did before, trying 
to make as few errors as possible. Some people 
think this sort of work background relieves the 
monotony ... others think it is distracting. 
However, we'd like you to work straight on, 
paying as little attention to it as possible. 


The final session for all subjects was termed 
the Latency (L) session. It was the same as the 
Q session except that (a) the speed of response was 
slowed from three-fifths to one-half the pretest 
rate and (b) an additional response was required 
of all subjects, which involved a slight modification 
of the apparatus. 

A small lamp was added to the apparatus, po- 
sitioned about 6 inches beyond the printing aperture 
from the subject. Next to it was a push button. 
On 11 occasions during the session, at unsystematic 
intervals of 2-6 minutes, but the same for each 
subject, the lamp was turned on at exactly the 
time of the paper shift. The subject was instructed 
to "put it out by pushing the button as fast as 
possible, without either putting down your pen or 
missing a turn, and before you print the two letters 
in the square." To make this task more discrimi- 
nating the 6-volt lamp was supplied from a 3-volt 
current source, which had the effect of causing 
it to glow rather than to shine brightly. Each subject
was given five practice light signals prior to beginning
the scored trials.

All subjects worked under Q conditions during 
their first monotonous work session. They were 
then divided into two groups, equated on the basis 
of proportion of errors and total number of re- 
sponses required during the first 42-minute session. 
In the second session one of these groups performed 
under Q conditions, the other under N conditions. 
In the third session the conditions of performance 
were reversed, while in the fourth session all sub- 
jects worked under L conditions. 

During each session of each subject readings 
were taken at regular intervals of the level of 
galvanic skin response. Since these data have little
applied interest they are not included in the present
report.


Treatment of Data. For each subject the follow- 
ing scores were computed: 


1. The Error Index (EI) consisted of the num- 
ber of response errors for each thousand responses 
required, and was computed for each session and 
for the four sessions combined. 


William N. McBain 


2. The Signal Index (SI) was the percentage of
errors scored which was signaled by the subject.
It was computed for each session, and for the four
sessions combined.

3. The Latency Index (LI) was secured from
the kymograph records of the final session. The
distances between “on” and “off” signals for the last
10 of the 11 trials were summated for each subject.
Since relative rather than absolute times were required,
this index was not converted to a time scale.

4. The Hypnotizability Index (HI) was the score
on the Friedlander and Sarbin scale. High scores
indicate greater, and low scores lesser, susceptibility
to hypnotic stimuli. The subjects were considered
“high” or “low” on this index as they fell above
or below the median for the group.
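
The two ratio scores defined above (EI and SI) can be sketched in a few lines. This is only an illustration of the arithmetic as described in the text; the function names, the example counts, and the handling of an error-free session are assumptions, not details taken from the study.

```python
from typing import Optional


def error_index(errors: int, responses_required: int) -> float:
    """Error Index (EI): response errors per thousand responses required."""
    if responses_required <= 0:
        raise ValueError("responses_required must be positive")
    return 1000.0 * errors / responses_required


def signal_index(errors_signaled: int, errors: int) -> Optional[float]:
    """Signal Index (SI): percentage of scored errors that were signaled.

    Returns None when no errors were made, because the percentage is then
    undefined. That choice is an assumption of this sketch; the paper does
    not say how error-free sessions were scored.
    """
    if errors == 0:
        return None
    return 100.0 * errors_signaled / errors


# Hypothetical session: 800 responses required, 12 errors, 7 of them signaled.
ei = error_index(errors=12, responses_required=800)
si = signal_index(errors_signaled=7, errors=12)
print(f"EI = {ei:.1f} errors per thousand, SI = {si:.1f}%")
```

Because SI divides by the number of errors, it becomes coarse as errors become few, which is the weakness the later reliability discussion returns to.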


RESULTS 
Noise, Hypnotizability, and Performance 


The original experimental plan called for
groups matched on Session I to perform in
Sessions II and III in a counterbalanced design.
With this intention thwarted by perverse
apparatus and subjects, it was decided
to analyse the results of equated rather than
matched groups, using a complex analysis
of variance design with three dimensions:
Condition (N vs. Q, correlated), Order of
Conditions (NQ vs. QN, correlated), and Hypnotizability
(High vs. Low, uncorrelated).5

Since measures under the two conditions
were available only for EI and SI,
separate groups were equated on these
measures for the first session, which all subjects
performed under the same (Q) conditions.
Tables 1 and 2 demonstrate that the
groups were not significantly different. The
results of Sessions II and III were


5 Cynthia Wimer planned and performed the
analysis presented.


TABLE 1

EQUATING OF EXPERIMENTAL GROUPS ON ERROR INDEX
(EI) PERFORMANCE IN SESSION I
(N = 24ᵃ)

Source of Variation               SS    df      MS      F
Hypnotizability                   42     1      42     <1
Order (of N & Q conditions)       95     1      95     <1
Interaction (H × O)              799     1     799   2.64
Error                          6,062    20   303.1
Total                          6,998    23

ᵃ Subjects 5 and 22 omitted.
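
As a check on how the F column follows from the sums of squares, the sketch below recomputes the ratios for Table 1 from the SS and df entries that are legible in the source. The function name is illustrative, and the critical value is quoted from a standard F table rather than from the paper.

```python
def anova_f_ratios(effects, error_ss, error_df):
    """Return F = (SS_effect / df_effect) / (SS_error / df_error) per effect."""
    ms_error = error_ss / error_df
    return {name: (ss / df) / ms_error for name, (ss, df) in effects.items()}


# Sums of squares and degrees of freedom as printed in Table 1.
table1_effects = {
    "Hypnotizability": (42, 1),
    "Order": (95, 1),
    "Interaction (H x O)": (799, 1),
}
f_values = anova_f_ratios(table1_effects, error_ss=6062, error_df=20)

# Critical F for 1 and 20 df at the .05 level, from a standard F table.
F_CRIT_05_DF_1_20 = 4.35

for name, f in f_values.items():
    verdict = "significant" if f > F_CRIT_05_DF_1_20 else "not significant"
    print(f"{name}: F = {f:.2f} ({verdict} at .05)")
```

None of the three ratios approaches the critical value, which is what the text means when it says the equated groups did not differ on Session I.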


Noise and Monotonous Work 


analyzed, with the results shown in Table 3
(groups equated for EI) and Table 4
(groups equated for SI).


TABLE 2

EQUATING OF EXPERIMENTAL GROUPS ON SIGNAL
INDEX (SI) PERFORMANCE IN SESSION I
(N = 26)

Source of Variation               SS    df      MS      F
Hypnotizability                  720     1     720   1.06
Order (of N & Q conditions)       40     1      40     <1
Interaction (H × O)            2,128     1   2,128   3.14
Error                         14,908    22     678
Total                         17,796    25


TABLE 4

ANALYSIS OF VARIANCE OF SIGNAL INDEX (SI)
PERFORMANCE IN SESSIONS II AND III
(N = 26)

Source of Variation               SS    df      MS        F
Hypnotizability                9,545     1   9,545    6.64*
Order (of N & Q conditions)      245     1     245     <1
Condition (N or Q)                23     1      23     <1
H × O                          1,601     1   1,601    1.11
O × C                          4,340     1   4,340    9.86**
H × C                          1,129     1   1,129    2.56
O × H × C                         33     1      33     <1
Subjects                      31,606    22   1,437    3.26**
Residual                       9,686    22     440
Total                         58,208    51

* Significant at .05 level.
** Significant at .01 level.

In brief, the results from these analyses
indicated:

In brief, the results from these analyses 
indicated: 


1. The difference in mean EI between N
conditions (12.22) and Q conditions (16.39)
of 4.17 errors per thousand required responses
is significant beyond the 5% level
of confidence. As measured by the making
of errors, performance under conditions of
noise is superior to performance under quiet
conditions.

2. The difference in mean SI between N
conditions (55.13) and Q conditions (56.45)
of 1.32% of errors signaled is not significant.
As measured by the signaling of errors,
performance under conditions of noise is not
significantly different from that under quiet
conditions, though the percentage of errors
signaled under quiet conditions is slightly
greater.

3. There is a highly significant interaction
between Order and Condition in their relation
to SI. This can best be understood as a
practice effect, i.e., there is a tendency for
the signaling of errors to be more accurate
under the performance condition experienced
second. There is no such effect in regard
to the making of errors.

4. There is no difference in mean EI between
subjects with low and with high HI.
However, the mean SI for subjects with low
HI (69.34) and for subjects with high HI
(42.24) showed a difference of 27.1 in percentage
of errors signaled, which is significant
beyond the 5% level. While persons who
are least susceptible to hypnosis make no
fewer errors than those who are most susceptible,
they do signal the errors made
with significantly greater accuracy.

The mean LI was determined in Session
IV for subjects low in hypnotizability (29.8)
and those high in this characteristic (30.9).
The difference of 1.1, in favor of those least
susceptible to hypnosis responding more
quickly, was not significant.


TABLE 3

ANALYSIS OF VARIANCE OF ERROR INDEX (EI)
PERFORMANCE IN SESSIONS II AND III
(N = 24ᵃ)

Source of Variation               SS    df      MS         F
Hypnotizability                    1     1       1      <1
Order (of N & Q conditions)      333     1     333      <1
Condition (N or Q)               209     1     209     4.66*
H × O                          1,781     1   1,781     3.30
O × C                             21     1      21      <1
H × C                             26     1      26      <1
O × H × C                        109     1     109     2.43
Subjects                      10,804    20   540.2    12.04**
Residual                         897    20    44.8
Total                         14,181    47

ᵃ Subjects 5 and 22 omitted.
* Significant at .05 level.
** Significant at .01 level.


Characteristics of the Performance Indices 


It seemed possible that a task of this gen- 
eral type might prove useful in the selection 
of personnel for jobs of an unavoidably 
monotonous nature. For this purpose it 
should demonstrate reasonably wide and consistent
differences among individuals. Table
5 indicates that even after more than 2 
hours practice on this relatively simple task 
subjects are reasonably well distributed on 
both EI and SI. An examination of the table 
indicates a relatively consistent lowering of 
the mean errors and narrowing of the spread 
of scores in the EI which is not shown in 
the SI. It is thought that the latter observa- 
tion reflects the inadequacy of the SI as a 
statistic. As the number of errors becomes 
small a very small absolute change in the 
number signaled changes the SI tremend- 
ously. For example, a subject having only 
one error will secure an SI of 100 if he signals
it, while his SI is 0 if he fails to signal
it. 
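
The instability described here is a direct consequence of the SI being a percentage of a small count. A minimal sketch, with a hypothetical helper name, shows how much the index moves when a single additional error is signaled.

```python
def si_change_per_error(errors: int) -> float:
    """Percentage-point change in SI when one more (or one fewer) of the
    subject's errors is signaled, given the total number of errors made."""
    if errors <= 0:
        raise ValueError("SI is undefined without at least one error")
    return 100.0 / errors


# The fewer errors a subject makes, the larger each step in SI becomes.
for n_errors in (1, 2, 5, 20, 50):
    print(f"{n_errors:>2} errors: one signal shifts SI by "
          f"{si_change_per_error(n_errors):.1f} points")
```

With one error the index can only be 0 or 100, exactly the case the text cites; with 20 errors each signal is worth 5 points.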

The reliability of the EI was estimated 
for Session I by the split-half method. The 
errors for successive 2-minute periods (with 
the exception of the first 2 minutes) were 
summated so that errors for Minutes 3, 4;
7, 8; 11, 12; etc. were correlated with 
errors from Minutes 5, 6; 9, 10; 13, 14; etc. 
This gave a coefficient of reliability of .84, 
which increased to .92 when the Spearman- 
Brown correction for length was applied. 
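
The split-half procedure just described can be sketched as follows. The alternation of 2-minute blocks follows the text; the data layout (a list of per-minute error counts for one subject), the helper names, and the hand-written Pearson function are assumptions made for illustration.

```python
from math import sqrt


def split_half_sums(errors_per_minute):
    """Split one subject's session into two halves of alternating 2-minute blocks.

    The first 2 minutes are dropped. Half A gets Minutes 3-4, 7-8, 11-12, ...
    and half B gets Minutes 5-6, 9-10, 13-14, ...
    """
    scored = errors_per_minute[2:]
    blocks = [sum(scored[i:i + 2]) for i in range(0, len(scored), 2)]
    return sum(blocks[0::2]), sum(blocks[1::2])


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to full-test length."""
    return 2.0 * r_half / (1.0 + r_half)


# Applying the correction to the rounded half-test coefficient of .84.
print(round(spearman_brown(0.84), 3))
```

In use, `split_half_sums` would be applied to each subject, the two columns of sums correlated with `pearson_r`, and the result passed to `spearman_brown`. Starting from the rounded .84 the correction gives about .91; the small gap from the reported .92 may simply reflect rounding of the half-test coefficient, though that is a guess.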

Since it is possible that errors in one 2-minute
period may influence errors in the
successive one, and hence lead to a spuriously
high coefficient, the errors in Session
II were correlated with those in Session III.
While it is true that half the subjects worked
under different conditions in Sessions II and
III, it may also be assumed that, even under
N conditions the task is a monotonous one,
though not as monotonous as under Q conditions.
For 24 subjects errors in Session II
vs. errors in Session III showed a correlation
of .84, a reasonable finding for a test-retest
reliability after one week, considering both
the disparate conditions and the restriction
of range following Session I. Both these findings
indicate a sufficiently high degree of
reliability for the EI as a performance measure
to make its use feasible as a predictor
in an industrial situation.

The reliability of the SI, determined as a
test-retest coefficient between Sessions II and
III, was found to be .56, a result most
probably attributable to the instability of the
index when small numbers of errors are made.
From this result it is clear that the SI computed
in this manner is not an adequate performance
measure for prediction in a personnel
situation, even though it has been
shown to discriminate between groups selected
on the basis of hypnotizability.


TABLE 5

SCORES FOR 26 Ss IN FOUR MONOTONOUS
WORK SESSIONS

                  Error Index (EI)                    Signal Index (SI)
                             Extreme Scores                       Extreme Scores
Session      M       σ      Lowest   Highest      M          σ      Lowest   Highest
I          30.2    21.9       0       97.4       50.4       25.2      0       100
II         15.9    16.8       0       58.2    [illegible]   28.6      0       100
III        14.2    19.1       0       91.0       61.8       30.1      0       100
IV         12.0    10.6      1.2      42.1       52.5       34.1      0       100

The reliability of the LI was estimated
by summating the latencies for signals given
at Minutes 3, 9, 17, 33, and 41, and at
Minutes 5, 15, 23, 27, and 35 for each
subject and correlating the resulting pairs
of totals. Each of the two sums contained
two latencies secured from signals given
after 2 minutes of uninterrupted work, one
from a signal given after 4 minutes, and
two from signals given after 6 minutes,
though these intervals occurred in different
orders in the two groups of signals summated.
The resulting split-half reliability,
corrected for length, was found to be
.85, which is of a magnitude usually considered
sufficient for use in prediction as well
as for comparison of group results.
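
The balance claimed for the two halves (two 2-minute, one 4-minute, and two 6-minute waits in each) can be checked from the listed signal times. The sketch below rests on one loud assumption: that the eleventh, unscored signal came first, at Minute 1. The paper does not state this; it is inferred because it is the only placement that makes the stated balance work out.

```python
# ASSUMPTION: the unscored eleventh signal occurred at Minute 1 (not stated in the paper).
ASSUMED_UNSCORED_SIGNAL = 1

HALF_A = [3, 9, 17, 33, 41]   # signal minutes listed in the text
HALF_B = [5, 15, 23, 27, 35]


def waits_before(signals, all_signals):
    """Sorted list of minutes elapsed since the previous signal, for each signal."""
    ordered = sorted(all_signals)
    return sorted(m - ordered[ordered.index(m) - 1] for m in signals)


all_signals = [ASSUMED_UNSCORED_SIGNAL] + HALF_A + HALF_B
waits_a = waits_before(HALF_A, all_signals)
waits_b = waits_before(HALF_B, all_signals)
print("Half A waits:", waits_a)
print("Half B waits:", waits_b)
```

Under that assumption both halves contain the same set of waiting intervals, so differences between the two sums cannot be attributed to unequal time since the last signal.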

The interrelation among indices was determined
by computing correlations which
used mean measures for all four sessions for
the EI and the SI and the single determination
of LI from Session IV. The three measures
were all intended to be indications of
the “alertness” of subjects. Thus, although
they could be considered relatively independent
in terms of experimental procedure,
it was anticipated that they would show a
reasonable degree of interrelation. For 26
subjects EI was found to correlate with SI
.28 and with LI .04, while LI and SI also
showed a correlation of .04. Since a correlation
of .388 is required for significance at
the 5% level with 26 pairs of scores, the
hypothesis implicit in the selection of these
measures of alertness is clearly untenable.
The findings suggest that alertness is not a
unitary characteristic in work behavior, and
that persons who can be depended upon to
perform adequately on a routine, repetitive
task which requires continuous attention
(EI) are not necessarily those who will be
most critical of the adequacy of their performance
(SI) or who will react most quickly
to an irrelevant interjected stimulus (LI).
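
The threshold of .388 quoted above follows from the usual t test of a correlation coefficient. The sketch below derives it; the critical t is taken from a standard two-tailed t table for 24 degrees of freedom, and the function name is illustrative.

```python
from math import sqrt

# Two-tailed critical t at the .05 level for 24 df, from a standard t table.
T_CRIT_05_DF24 = 2.064


def critical_r(t_crit: float, n_pairs: int) -> float:
    """Smallest |r| that reaches significance, using t = r*sqrt(df)/sqrt(1 - r^2)."""
    df = n_pairs - 2
    return t_crit / sqrt(t_crit ** 2 + df)


r_needed = critical_r(T_CRIT_05_DF24, n_pairs=26)
print(f"r required with 26 pairs: {r_needed:.3f}")

# Magnitudes of the three intercorrelations as reported in the text.
reported = {"EI with SI": 0.28, "EI with LI": 0.04, "LI with SI": 0.04}
for pair, r in reported.items():
    print(pair, "significant" if abs(r) >= r_needed else "not significant")
```

All three reported coefficients fall short of the threshold, which is the basis for the conclusion that the measures do not tap a single trait.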


Discussion 
Arousal and Vigilance Tasks 


The majority of recent studies concerned
with the effects of noise on human performance
have secured data from tasks of “vigilance,”
in which the subject typically had
to respond “only to very infrequent signals
but may have to watch for them over long
periods” (Broadbent, 1958, p. 108). Two
other features are characteristic of the studies
summarized by Broadbent. They are
primarily concerned with decrements in performance
over the length of the period during
which exposure to noise occurs. More important
to the present study, they usually
use relatively homogeneous noise, specified
as to frequency band or spectrum and as to
intensity. Such noise adds no variability to
the work situation, but rather adds to its
monotony, since the proportion of total stimulation
which is nonvariable or uniform
is increased. It is tempting to maintain that
within-session performance decrements
under such circumstances may be more
adequately explained by a gradual reduction
in arousal attributable to the low
variability of the work situation than by an
[illegible] to “psychological stress” (Jerison,
1959).

The present study lends no support to such
a position. In all except the first (or [illegible])
session, whether or not noise was used,
errors increased with time. A graphic representation
of errors summated for each 6-minute
period suggests both that the slopes
of error curves for the Q, N, and L sessions
are not significantly different and that the
curve for the N session shows a considerably
smoother rise. Incomplete records did not
allow a more accurate check of these apparent
trends.

Since there is no evidence that within- 
session performance decrement is affected by 
highly variable noise, it must be concluded 
tentatively that the arousal hypothesis does 
not account for the decrease in accuracy 
which is of most interest to investigators of 
vigilance tasks. Indeed, it seems that the 
increase in stimulus variability brought 
about by the noise acted uniformly through- 
out the session to reduce significantly the 
proportion of errors. Whether or not the
error curve rises more uniformly under con- 
ditions of variability is a question for fur- 
ther investigation. 


Alertness and Hypnotic Susceptibility 


One way of interpreting the signaling of 
errors is as a measure of the extent to 
which the subject is critical of his perform- 
ance. While peripheral to the question of the 
effect of noise on performance, the finding 
that those who signal errors most poorly 
are most susceptible to hypnosis deserves 
further mention. A critical attitude is a rec- 
ognized deterrent to the induction of hyp- 
nosis. If a truck driver took such an attitude 
to the presence of a colonial house in the 
middle of the road (McFarland & Moseley,
1954, p. 124, Case 2) it is doubtful that an 
hallucination could become fully developed. 

The magnitude of the relation between
susceptibility to hypnosis and awareness of
errors is indicated by the biserial correlation
between high or low HI and the composite
SI for the two sessions performed under Q
conditions: r_bis = -.46 (p < .01). When the
same statistic is computed, using a composite
SI which also includes scores from the N
sessions, the resulting correlation is not significant
(r_bis = .28). This suggests that susceptibility
to hypnosis is most likely to be a
performance-relevant variable under highly
monotonous work conditions.


316 


Utility and Limitations of the Findings 


The experimental monotonous task used 
in this study has been demonstrated to pro- 
vide stable measures of errors (EI) and 
latency (LI), while further development 
might provide an equally adequate index of 
“awareness” of errors. These may reasonably 
be expected to predict performance in 
monotonous industrial jobs. However, there 
are certain considerations that suggest they 
be used and interpreted with caution. 

The first of these is the finding from the 
present study that alertness appears not to 
be a unitary characteristic of the individual. 
A very careful criterion analysis must there- 
fore precede attempts to validate such pre- 
dictors, so that the one most relevant to 
desired performance may be used. 

The second consideration has to do with 
the relation between the reaction of the 
individual to understimulating environments 
and his reaction to those which provide over- 
stimulation. We have at present no indica- 
tion of whether a relation exists between 
these two adverse reactions or, if it does, 
whether it is positive or negative. A positive 
relation would indicate the possibility of 
hiring workers whose performance would be 
relatively independent of conditions of en- 
vironmental stimulation. If a negative rela- 
tion exists it would suggest caution in hiring 
for monotonous jobs solely on the basis of 
this sort of predictor. In an extreme situa- 
tion we might find our work force virtually 
helpless in the face of the overstimulation 
brought about by an emergency. 

A third factor which should be investi- 
gated further is the question of adaptation 
to stimulus variability. Variability, by its 
very nature, cannot be augmented and pro- 
longed indefinitely. Hence it will be im- 
portant to know whether a certain level of 
variability continues to be associated with 
improved performance over an extended pe- 
riod of time. 

Still another question to ask is whether
predictive and concurrent validities for
monotonous types of jobs and their predictors
would be at all similar in magnitude.
It may be that attrition or self-selection in
monotonous jobs is an even more potent
factor in range restriction than we have to
deal with in most predictor variables. Hence
we should be alert for the situation in which
a barely significant relation, as determined
on present employees, becomes a highly efficient
predictor when used with applicants.

Finally, one must consider the question of
motivation, since variations in motivation
may play a critical part in the [illegible]
performance quite distinct from that played
by the physical conditions of work. Considerable
evidence has accumulated to show that
unfavorable conditions of work may be compensated
for within limits by increasing
incentive (Pepler, 1954). This suggests that
unfavorable conditions may do more to inhibit
a worker's desire to achieve greater
production than his ability to do so. Similarly,
variability introduced into the monotonous
work environment may make it possible
to perform more accurately without ensuring
that more accurate work will result. In any
short-range experiment there are bound to
be idiosyncratic motivational effects which
are atypical in terms of a routine work situation.
For these several reasons a test in
some sort of routine production situation will
be necessary before the practical implications
of this study can be assessed adequately.


SUMMARY


The contradictory findings concerning the
effects of noise on performance may be reconciled
by a dimensional analysis of the
characteristics of the noise and of the task
involved, and an attack upon specific dimensions,
based upon a theoretical rather than
a strictly empirical approach. From
the “arousal hypothesis,” noise which is low
in “intelligibility” (or distraction value for
the individual), while at the same time
high in variability, should enhance performance
in a monotonous task.

The laboratory task which was devised
to conform to the criteria established for a
“monotonous” work situation consisted of
42-minute periods of handprinting pairs of
letters in continuous sequence at predetermined,
individually paced rates. Two groups
of male subjects, equated on performance in
a first session, were exposed to noise of the
specified type in counterbalanced order in
either a second or third similar session. A
significant improvement in performance in
terms of errors made in the task was associated
with exposure to noise. Awareness
of errors, as indicated by their signaling,
was unchanged.

The similarity of phenomena associated
with low stimulus environments and the
hypnotic state suggested the obtaining of a
standard measure of susceptibility to hypnosis
from all subjects. Signaling of errors
was significantly more accurate by those
least susceptible to hypnosis, though the
making of errors by subjects of low and high
susceptibility was not significantly different.

In a fourth session a measure of latency
of response to a series of unsystematically
spaced visual signals from a low intensity
light was recorded. The three measures of
“alertness” (errors made, errors signaled,
and latency) showed no significant intercorrelation,
suggesting this characteristic is
not unitary. The reliability of error and
latency measures was sufficient to make them
practicable in a personnel selection situation.

The implications of these findings are discussed
in terms of their relation to “vigilance”
tasks, and their applicability to personnel
selection practice. The question of
long-term adaptation to environmental variability
is also considered.


REFERENCES 


Berrien, F. K. The effects of noise. Psychol. Bull.,
1946, 43, 141-161.

Bexton, W. H., Heron, W., & Scott, T. H. Effects
of decreased variation in the sensory environment.
Canad. J. Psychol., 1954, 8, 70-76.

Broadbent, D. E. Noise, paced performance and
vigilance tasks. Brit. J. Psychol., 1953, 44, 295-303.

Broadbent, D. E. Perception and communication.
New York: Pergamon, 1958.

Deese, J. Some problems in the theory of vigilance.
Psychol. Rev., 1955, 62, 359-368.

Friedlander, J. W., & Sarbin, T. R. The depth of
hypnosis. J. abnorm. soc. Psychol., 1938, 33,
453-475.

Hebb, D. O. Drives and the C.N.S. (conceptual
nervous system). Psychol. Rev., 1955, 62, 243-253.

Jerison, H. J. Effects of noise on human performance.
J. appl. Psychol., 1959, 43, 96-101.

McBain, W. N. Use of psychological constructs for
improving selection test validity. Canad. J.
Psychol., 1957, 11, 164-170.

McFarland, R. A., & Moseley, A. L. Human factors
in highway transport safety. Boston: Harvard
School of Public Health, 1954.

Melton, A. W., & Briggs, G. E. Engineering psychology.
Annu. Rev. Psychol., 1960, 11, 71-98.

Morgan, J. J. B. The overcoming of distractions
and other resistances. Arch. Psychol., NY, 1916,
5, 1-84.

Pepler, R. D. The effect of climatic factors on the
performance of skilled tasks by young European
men living in the tropics. (APU 199/53) Cambridge,
England: Applied Psychology Research
Unit, 1954.

Scott, T. H. Intellectual effects of perceptual isolation.
Unpublished doctoral dissertation, McGill
University, 1954.

Scott, T. H., Bexton, W. H., Heron, W., & Doane,
B. K. Cognitive effects of perceptual isolation.
Canad. J. Psychol., 1959, 13, 200-209.


(Received September 19, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 5, 318-324 


A VERIFICATION SCALE FOR THE STRONG VOCATIONAL 
INTEREST BLANK, MEN'S FORM 


ROBERT W. FILBECK 


University of Nebraska 


It is generally recognized that an occasional 
subject will respond to paper-pencil inven- 
tories in an invalid manner. Some subjects 
may fake their responses. Others will mark 
their answer sheets haphazardly or at random. 
Carelessness, improper motivation, and failure 
to comprehend inventory items or marking in- 
structions are the more frequent reasons for 
invalid responding of a random or haphazard 
nature. 

In view of the factors responsible for ran- 
dom responding being uncontrollable and un- 
detectable during most test administrations, it 
seems advisable to devise, if possible, a means 
of identifying profiles derived from haphaz- 
ardly marked answer sheets. This is particu- 
larly true of the Strong Vocational Interest 
Blank, Men’s Form (SVIB), since this in- 
strument is one of the most frequently used 
instruments in counseling and in research. 

The problem posed here is not new. Strong 
(1943) determined, through die tossing, the 
characteristics of profiles derived from ran- 
domly marked SVIB answer sheets. Chance 
levels on each scale are denoted on the profile 
sheet as shaded areas. The vocational inter- 
ests of dice then will fall within the shaded 
areas. The vocational interests of some real 
people also will fall in this area and the coun- 
selor is unable to determine with any reason- 
able degree of confidence whether a given 
profile is the result of chance answering or 
is actually a validly undifferentiated interest 
pattern. 

Authors of other inventories similar to the 
SVIB have approached the problem of identi- 
fying random responding through construc- 
tion of special scales (Callis, 1948; Hath- 
away & McKinley, 1945; Kuder, 1956). Such 
scales, termed “validity” or “verification” 
scales (V scales), were constructed by scor- 
ing for item responses chosen rarely in real 
testing and significantly more often by chance. 


(The converse was true for the Kuder V 
scale.) 


AND 


ROBERT CALLIS 


University of Missouri 


THE PROBLEM 


The purpose of this investigation was to
construct and validate a V scale for the SVIB.
The V scale construction procedures approximated
those employed by Kuder, Hathaway
and McKinley, and Callis in developing “rare
response” scales for other inventories.

Since the rationale underlying a rare response
scale defines deviant scores in the
chance direction as being indicative of either
random responding or of highly unusual interests,
an attempt was made to construct a scale
that would provide a range of scores that
would be obtained infrequently either in the
student population or from random responding.
This “genuine but unusual” level would
fall at an intermediate position to the typical
and chance levels on the V scale. It was hypothesized
that the intermediate level would
be provided by including items in the scale
that were rarely selected by college students
but of a lesser degree of rarity than commonly
required. For this reason, the criterion for
rarity was a difference between obtained and
chance response frequencies significant at the
.10 level rather than the more stringent criteria
of the .05 and .01 levels.


PROCEDURE 


The construction sample was composed of 192
freshmen enrolled in the University of Missouri during
the 1955-56 school year. The total sample was
divided into eight division-by-sex subsamples as
follows:


MALES               FEMALES
Arts and Science    Arts and Science
Agriculture         Agriculture
Education           Education
Engineering         Nursing


Each subsample, except males in Education, contained
25 subjects. Only 17 subjects were available
for inclusion in the Education males subsample.

The SVIB had originally been administered to a
group of approximately 400 freshmen for purposes
of a larger study, and the SVIB answer sheets for the
construction sample were selected at random
from this population.


318 


Verification Scale for SVIB 


The first step in isolating rare item-responses consisted
of contrasting the response frequency for the
total construction sample with the frequency expected
from chance (33%, since there are three item-responses
for each item). The index used was the phi
coefficient (Jurgensen, 1947). The criterion for rarity
was a phi significant at the .10 level (φ = .127), with
the obtained frequency being less than the chance
frequency.

Next, the response frequency for each item response
was contrasted with chance expectancy for
each division-by-sex subsample. The criterion for acceptance
here was chance frequency exceeding the
obtained frequency by 10%.


RESULTS 


These procedures yielded a V scale of 103 
item-responses, all unit weighted in the posi- 
tive direction, and distributed among 97 items. 
The SVIB item numbers and item responses 
Scored on the V scale key are listed in Table 1. 


Validity 


Initial validation of the SVIB V scale was a
test of the ability of the scale to discriminate
between genuine and chance marked answer

TABLE 1 
SVIB ITEMS AND RESPONSE CATEGORIES
SCORED ON THE SVIB V SCALE


E. 1 346. 3 
To 204. D 261. D 
pé 1 205. D 262. D 356. 3 
2l 206. D 264. D — 357. 1 
m L 207. D 265. D 358. 3 
d L 208. D 268. L — 359. 3 
d ; 360. 1 
pe L 227. D 270. L 0. 
a 271. D 363. no 
E mn 272. L — 367. no 
nl im oc 3 L — 370. ?,no 
9p Y 236. D 273. L 10. ?, 
iy b 237, L 275. L 371. no 
3 e: 
liy D 238. 1 276. D 372. mo 
u D 239. L 277. L 375. no 
gy DD 279. D 376. no 
` mE 1. no 
us LD 55 p — 29 D 38 
i; D — 5 p 29. I $38. no 
es 383. no 
E» mp x1 9 
TN 251. L 315. 1 — 38A no 
oy 252. LD 317, 3 38» no 
Wr 253 L — 325. 1 386. no 
ls L 254. L 327. 2 387. n 
83, D 255. D 329. 3 388. i 
He D 258. L 337. 3 2 p 
ES x5 2 Sew 
260. L 342. 3 i 
400. no 


319 


sheets. Since, with large numbers, relatively 
small scale distances between means will be 
statistically significant, a difference-between- 
means criterion for validity was not used. In- 
stead, percentage of overlap of the two cri- 
terion distributions was considered a proper 
index of validity. If the two distribution 
means were clearly separated and overlap 
was so slight as to permit rather unambiguous 
dichotomization, the scale would be consid- 
ered valid for the purpose stated here. 

The cross-validation sample was composed 
of 908 subjects; 409 male and 249 female 
freshmen from the University of Missouri, 
100 male clients selected at random from the 
files of the University of Missouri Testing 
and Counseling Service, 100 male freshmen 
from the University of Minnesota; and 50 
males of upper class and graduate status from 
Central Missouri State College. The V scale 
mean and SD for the total cross-validation 
sample were 8.55 and 4.73, respectively. The 
mean and SD of the theoretical chance dis- 
tribution were computed to be 34.4 and 4.6, 
respectively. The two distributions are depicted
in Figure 1.

Examination of Figure 1 reveals the va- 
lidity criterion to be amply satisfied. The two 
distributions are clearly differentiated, more 
than five SDs separating the two means. 

While it is obviously impossible to estab- 
lish a cutting score which will permit perfect 
dichotomizing of V scores into real or chance 
categories, it is possible to sort answer sheets 
into real or chance groups with about 99% 
accuracy if the cutting score is set at V score 
equal 23. Just less than 1 in 100 chance-an- 


1 The writers wish to thank Ralph Berdie for his
cooperation in furnishing SVIB records from the University
of Minnesota.
2 Statistics of the theoretical chance distribution were
computed from the following formulas:
$$\text{Mean} = \frac{N_1 + 2N_2}{3}$$
$$SD = \sqrt{N_1 PQ + N_2 PQ}$$
where $N_1$ = number of items contributing one item response,
$N_2$ = number of items contributing two item responses,
$P$ = probability of chance marking of V scale
item-response for each item, and
$Q$ = probability of chance marking of non-V
scale item-response for each item.
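
The footnote formulas can be checked numerically against the scale described in the Results (103 item-responses spread over 97 items). The sketch below assumes that no item contributes all three of its responses, so that 6 items contribute two responses and 91 contribute one, and that each response has a chance probability of one-third, as the Procedure states. The variable and function names are illustrative.

```python
from math import sqrt

N_ITEMS = 97            # items on the V scale (from the Results)
N_ITEM_RESPONSES = 103  # keyed item-responses (from the Results)

# ASSUMPTION: every item contributes either one or two keyed responses.
n2 = N_ITEM_RESPONSES - N_ITEMS   # items contributing two responses
n1 = N_ITEMS - n2                 # items contributing one response

P_SINGLE = 1.0 / 3.0  # chance of marking any one of three responses


def chance_mean(n1: int, n2: int, p: float = P_SINGLE) -> float:
    """Expected V score under random marking: (N1 + 2*N2) / 3 when p = 1/3."""
    return n1 * p + n2 * (2.0 * p)


def chance_sd(n1: int, n2: int, p: float = P_SINGLE) -> float:
    """SD of the V score under random marking, summing P*Q over items."""
    p_two = 2.0 * p  # probability of hitting either of two keyed responses
    return sqrt(n1 * p * (1.0 - p) + n2 * p_two * (1.0 - p_two))


print(f"n1 = {n1}, n2 = {n2}")
print(f"chance mean = {chance_mean(n1, n2):.2f}, chance SD = {chance_sd(n1, n2):.2f}")
```

Under these assumptions the sketch gives a mean near 34.3 and an SD near 4.6, close to the 34.4 and 4.6 reported in the text.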




[Figure 1: horizontal axis, V scores (0 to 50); curves labeled Obtained and Theoretical Chance.]

Fig. 1. Obtained and theoretical chance distributions on the SVIB V scale.


swered inventories will have V scores of less 
than 23, while 1.29% of the assumed genuinely
answered inventories score at this value
or higher.
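
The claim about the cutting score can be roughly checked by treating the theoretical chance distribution as normal with the reported mean and SD. That normal approximation, and the use of a continuous cutoff without a continuity correction, are assumptions of this sketch; the 1.29% figure for genuine answer sheets is an observed proportion and is not modeled here.

```python
from statistics import NormalDist

CHANCE_MEAN, CHANCE_SD = 34.4, 4.6      # theoretical chance distribution (from the text)
OBTAINED_MEAN, OBTAINED_SD = 8.55, 4.73  # cross-validation sample (from the text)
CUTTING_SCORE = 23

# Proportion of randomly marked sheets expected to score below the cutting score,
# ASSUMING the chance distribution is approximately normal.
chance = NormalDist(mu=CHANCE_MEAN, sigma=CHANCE_SD)
p_chance_below_cut = chance.cdf(CUTTING_SCORE)

# Separation of the two means, in units of the obtained SD.
separation_in_sds = (CHANCE_MEAN - OBTAINED_MEAN) / OBTAINED_SD

print(f"P(chance sheet scores below {CUTTING_SCORE}) = {p_chance_below_cut:.4f}")
print(f"Means are {separation_in_sds:.2f} obtained SDs apart")
```

The approximation puts the share of chance sheets falling below 23 under 1 in 100 and the separation of the means above five SDs, both in line with the text.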


Sex and Academic Group Differences on the
V Scale 


Although the procedures followed in the 
construction of the V scale held to a minimum 
the likelihood of obtaining V scale differences 
among student groups classified as to academic
affiliation or sex, it was desirable to
determine on a cross-validation sample if 
this, in fact, was accomplished. For this test 
the distributions of the four male and four 
female validation subsamples from the Uni- 
versity of Missouri were contrasted. The 
means and SDs of these subsamples are pre- 
sented in Table 2. 

The data in Table 2 indicate the construc- 
tion procedures to have been successful in 
providing a scale of item responses which are 
rarely selected by university students, regard- 
less of sex or curricular affiliation. However, 
even the small differences found here are too 
great to be charged to chance sampling ef- 
fects. Analysis of variance techniques yielded 


an F ratio of 2.21, significant between the
.03 and .04 levels.


The finding of a significant value of F in
this instance does not require any substantial
modification of interpretation of a client's V
score because of his sex or curricular affiliation,
however. The differences between groups are too
small, two raw scores separating the means of
the two extreme groups, to be interpretable in
the practical sense.


Institutional Differences on the V Scale


To test the possibility that populations in
institutions of higher education other than


TABLE 2

V SCALE MEANS AND STANDARD DEVIATIONS OF
EIGHT UNIVERSITY OF MISSOURI CROSS-
VALIDATION SAMPLES

Sample n M* SD
Males
Arts and Sciences 159 9.5 i 
Agriculture 113 8.3 4d 
Engineering 137 7.6 53 
Counseling Clients 100 9.6 ia 
Females 
a 5.0 
Arts and Sciences 117 9.3 38 , 
Agriculture 14 9.7 52 
Education 84 8.9 43 
Nursing 34 8.8 50 
Total 758 8.9 s 
* An analysis of variance test for significance of differences
among these means indicated only slight significance
(F = 2.21; .03 < p < .04).




the University of Missouri will score differ-
ently on the V scale, the V scale distributions
of the male samples from the three afore-
mentioned institutions were contrasted. The
means and SDs of the three samples appear
in Table 3. The likelihood of the sample
differences arising by chance was determined
by analysis of variance techniques.
An F ratio of 2.76, not significant at the .05
level, was obtained, offering no evidence
against the hypothesis of random sampling
from populations of equal means.


Reliability 


Internal consistency of the V scale was esti-
mated from the total University of Missouri
sample, using Kuder-Richardson Formula No.
3 (Adkins, 1947). An r of .66 was obtained,
which may be considered an underestimate
since the assumption of equal item difficulty
was not met. The desired degree of internal
consistency is difficult to specify. There are
no data available which would suggest that
such a V scale should have high or low in-
ternal consistency.
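
The remark that the coefficient is an underestimate when items differ in difficulty is characteristic of the Kuder-Richardson formula now usually called KR-21, which needs only the number of items, the mean, and the standard deviation. Whether the formula numbered 3 in Adkins (1947) is that formula is an assumption made here for illustration; the text does not say so. The inputs below are hypothetical and are not figures reported for the V scale.

```python
def kr21(n_items, mean, sd):
    """Kuder-Richardson Formula 21 estimate of internal consistency."""
    variance = sd ** 2
    return (n_items / (n_items - 1)) * (
        1 - mean * (n_items - mean) / (n_items * variance)
    )


# Hypothetical inputs, not figures reported for the V scale.
print(round(kr21(n_items=40, mean=22.0, sd=6.0), 3))
```

Because the formula treats every item as equally difficult, it can only understate the value obtained from item-level data when difficulties vary, which is the direction of error the text describes.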

D. Brantley (1958 unpublished) readministered
the SVIB with standard instructions
to a group of 25 males, allowing a 2-week
interval between administrations. He obtained a
V scale test-retest r of .80.


Relationship of V Scale to Other SVIB Scales 


Using the 100 male freshmen constituting
the University of Minnesota sample,
product-moment correlations were computed
between the V scale and each of the scales on
the SVIB profile. In the main, the r's were
low, less than half departing significantly


TABLE 3
V SCALE MEANS AND STANDARD DEVIATIONS OF MALE
SAMPLES FROM THREE INSTITUTIONS OF
HIGHER EDUCATION

Institution n M* SD
DUM Mi > - > s di 
y, Vergi, SSouri State College 50 p = 
Pi ity &5 459 
vergi? Of Minnesota 100 i 
Eo 509 83 476 
Aissouri 509 — 8. 
"rn. g 
* An analysis of variance test for significance of the differences
among these means indicated no significant differences
(p > .05).


TABLE 4
SIGNIFICANT CORRELATIONS BETWEEN SVIB
PROFILE SCALES AND THE SVIB V SCALE


Scale                                   r
Artist                                  .24*
Veterinarian                            J2r*
Farmer                                  .20*
Aviator                                 Paks
Printer                                 E
Forest Service Man                      .22*
YMCA Physical Director                  A d
Personnel Director                      .28**
Public Administrator                    .40**
YMCA Secretary                          .26**
Social Science High School Teacher      97**
City School Superintendent              AO
Social Worker                           .28**
Musician                                .47**
CPA                                     .26**
Senior CPA                              .26**
Advertising Man                         .20*
Lawyer                                  .23*
Author-Journalist                       .30**
Occupational Level                      .20*


* p < .05.
** p < .01.


from zero. The r's which were significant (p
< .05 and p < .01) are listed in Table 4.
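
Why correlations as small as .20 reach significance with 100 cases can be shown with the usual t test for a product-moment r. The sketch below is illustrative only: the critical values are rounded two-tailed values for 98 degrees of freedom taken from standard tables, not from the article, and the function names are invented for the example.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a product-moment r against zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Two-tailed critical values of t for df = 98, from standard tables
# (rounded); they are an assumption of this sketch, not given in the article.
T_CRIT_05 = 1.984
T_CRIT_01 = 2.627


def stars(r):
    """Significance marker for an r based on 100 cases."""
    t = abs(t_for_r(r, 100))
    if t >= T_CRIT_01:
        return "**"
    if t >= T_CRIT_05:
        return "*"
    return ""


for r in (0.20, 0.24, 0.26, 0.47):
    print(r, round(t_for_r(r, 100), 2), stars(r))
```

On these assumptions an r of .20 just clears the .05 level and an r of .26 just clears the .01 level, which matches the pattern of single and double asterisks in Table 4.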

Noting the tendency for higher V scores to 
accompany interests in the social welfare area, 
19 profiles having a Group V primary interest 
pattern were contrasted to those not so char- 
acterized. The “Group V" profiles had a mean 
V score of 7.1, the contrasted group had a 
mean V score of 8.8. Failure to obtain a dif- 
ference in the expected direction in this test 
would seem to indicate that no modification 
of V scale interpretation need be made on ac- 
count of measured interests. 


V Scale Norms 


Standard score equivalents of SVIB V scores 
are given in Table 5. Table 5 was derived 
from the V scores of the total cross-validation 
sample of 908 subjects. 

Because of marked skewness of the student 
distribution (see Figure 1), scaling based on
normal curve procedures was inappropriate, 
so the T score equivalents in Table 5 were de- 
rived from an area transformation (Guilford, 
1950). 
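
An area transformation of the kind cited can be sketched as follows. This is one common form of the procedure (the proportion of cases below the midpoint of each raw score is converted to a normal deviate and rescaled to a mean of 50 and SD of 10); it is offered as an assumption about the general method, not as the exact steps used to build Table 5. The frequencies are hypothetical.

```python
from statistics import NormalDist


def area_t_scores(freq):
    """Map raw scores to normalized T scores by an area transformation.

    freq: dict of raw score -> number of cases (each count must be > 0).
    """
    total = sum(freq.values())
    below = 0
    t_scores = {}
    for raw in sorted(freq):
        # Proportion of cases below the midpoint of this raw score.
        p = (below + freq[raw] / 2) / total
        z = NormalDist().inv_cdf(p)  # normal deviate cutting off area p
        t_scores[raw] = round(50 + 10 * z)
        below += freq[raw]
    return t_scores


# Hypothetical, positively skewed distribution of raw scores.
example = {0: 5, 1: 20, 2: 30, 3: 25, 4: 10, 5: 6, 6: 3, 7: 1}
print(area_t_scores(example))
```

Because the T scores follow the cumulative area rather than the raw-score scale, equal raw-score steps in the long upper tail of a skewed distribution receive progressively smaller T-score steps, as in the upper half of Table 5.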


TABLE 5 
Norms For THE SVIB V SCALE 


Raw Score T Score* Raw Score T Score*


0 0 17 65 
1 21 18 66 
2 27 19 67 
3 31 20 68 
4 36 21 70 
5 40 22 70 
6 43 23 71 
7 46 24 72 
8 49 25 72 
9 51 26 73 
10 53 27 74 
11 55 28 74 
12 57 29 75 
13 59 30 76 
14 61 31 76 
15 62 32 79 
16 64 33 81 


* Standard scores with a mean of 50 and a standard deviation 
of 10 obtained by an area transformation (Guilford, 1950). 


Interpretation of SVIB V Scores 


In the statement of the problem it was 
pointed out that use of the V scale should 
permit classification of SVIB answer sheets 
into three categories: genuine and typical, 
genuine but deviant, and chance or random 
response. From Figure 1, it is apparent that 
the genuine/typical and the chance categories 
are clearly differentiated. Also apparent is the 
existence of a range of scores near the cross- 
over point of the two distributions that in- 
cludes relatively few cases from either dis- 
tribution. 

Establishing cutting scores to define the 
limits of each category, in the absence of em- 
pirical evidence, has been approached arbi- 
trarily, relying on the experiences of authors 
of other scales. 

Earlier, it was demonstrated that the likeli-
hood of mistakenly accepting a chance-an-
swered inventory could be held to less than 1
in 100 at the expense of rejecting approxi-
mately 1.5% of the presumably genuinely an-
swered inventories as being responded to at
random. Ordinarily, rejection of slightly more
than 1% would be deemed a reasonable
"price" to pay for objective assurance of



valid testing. However, experience with the F
scale of the Minnesota Multiphasic Person-
ality Inventory would indicate that even this
small percentage may be too great. Hathaway
and McKinley (1945) point out that
high scores (T = 65-80) indicate unusual be-
havioral characteristics but do not invalidate
the testing. They do not consider the testing
questionable at less than T = 80.

Mindful of this experience, the suggested
cutting score between genuine and random re-
sponding on the SVIB V scale is set at
T = 74. This will hold the likelihood of in-
cluding a chance-answered inventory to less
than 9 in 100, a reasonable confidence level
for the purposes specified here.

The lower boundary of the intermediate or
"valid but highly unusual" level is more diffi-
cult to specify rationally or empirically. Hath-
away and McKinley use the T = 65 score as
a dividing point on the MMPI F scale. In the
absence of empirical data, a similar dividing
line is suggested here, thus defining as "highly
unusual interests" SVIB profiles having T
scores 65 through 74 (raw V scores, 17
through 28).

Suggesting the boundaries of unusual inter-
ests does not, however, define the nature or
the extent of "unusualness." Carnes³ has pro-
vided some preliminary information in this
respect in reporting on the V scale distribu-
tion obtained from the SVIBs of 40 hospital-
ized patients in a Veterans Administration
psychiatric hospital. He had personally ad-
ministered the inventories under highly indi-
vidualized conditions, thus giving subjective
assurance of valid test taking. The mean pa-
tient V score was T = 70,
and the patient sample SD was 14. The in-
formation Carnes provides suggests that, for
many individuals, high V scores reflect strong
conflicting drives which tend to be disorgan-
izing to the total personality and are more
determinative of behavior than are interests.
In predicting the activities an individual with
a high V score would find satisfying, the coun-
selor would probably place minimal depend-
ence on the SVIB profile as such, and would
resort to other, more clinical, indications of
drives and motives.


³ C. D. Carnes, personal communication, 195




Effects of Faking on V Scores 


Although the SVIB V scale was not de-
signed as a device to detect faking, one in-
vestigation has studied the effects of deliber-
ate falsification of interests on V scores.

D. Brantley (1958 unpublished) adminis-
tered the SVIB to a group of 25 men enrolled
at Central Missouri State College. The first ad-
ministration was according to standard instruc-
tions, the second under instructions to fake.
The faking instructions directed subjects to
respond to the SVIB
as though applying for a highly remunerative
but intrinsically uninteresting job, to be se-
lected from the occupations listed on the
SVIB booklet. The V scores of the faked an-
swer sheets showed a mean increase of six
raw score points over the sincere answer
sheets. Although this increase is significant
(t = 2.56, p < .05), the mean faking score
does not approach the unusual interest range.
Faking, therefore, does not appear to be a
hypothesis to be attached to higher V scale
levels.
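
Since the same 25 men were tested twice, a t test on paired differences is the natural form for a comparison of this kind. The source reports only the resulting t; that a related-samples test was used is an assumption here, and the scores below are hypothetical, chosen only to show the computation.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """t statistic for the mean of paired differences (df = n - 1)."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Hypothetical raw V scores for five people: standard, then faking instructions.
standard = [7, 9, 10, 6, 12]
faked = [12, 11, 19, 9, 20]
print(round(paired_t(standard, faked), 2))
```

The statistic depends on the spread of the individual differences as well as on their mean, so a mean gain of six points can yield a modest t when subjects differ widely in how much their scores change.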

SUMMARY 


A “verification scale” (V scale) has been
constructed for the Strong Vocational Interest
Blank, Men’s Form. Item analysis of the
SVIB, based on the responses of 192 male and
female subjects, isolated 103 item-responses
which were rarely selected by students. “Rar-
ity” was determined by contrasting the ob-
tained frequency of response with that ex-
pected from a chance answering of the SVIB.
The criteria for rarity were differences signifi-
cant at the .10 level, for the total group, and
a difference between obtained and chance
frequencies for each major curricular and sex
group.


Cross-validation studies to date, utilizing a
cross-validation sample of 908 subjects, yield
the following information:

1. Genuine and chance response profiles are
well differentiated by the SVIB V scale.

2. The differences among the mean V
scores of curricular groups and between the
sexes are minor, although there is an indication
that men in engineering tend to obtain slightly
lower V scores.

3. No significant differences were found
among the mean V scores of three male sam-
ples from different institutions of higher edu-
cation.

4. V scores in the range T = 65-75 suggest
personality deviation of a nature requiring
some modification of SVIB profile inter-
pretation.

The reliabilities of the V scale were esti- 
mated to be .66 by means of K-R Formula 
#3 and .80 by test-retest technique when the 
interval between testings was two weeks. 

Coefficients of correlation between the V 
scale and the SVIB profile scales show a sig- 
nificant value of r in 20 instances, although 
the r's tend to be small. It was determined,
however, that no differences in V scale inter- 
pretation need be made because of measured 
vocational interests. 

On the basis of present information, coun-
selors may use the V scale to establish three
general levels of scores: 

1. T scores of 0-64 represent a range of 
scores which typify answer sheets marked 
with attention and understanding and pre- 
sumably on the basis of “normal” interests. 
The interpretation of SVIB profiles with V 
scores in this range may proceed rather 
straightforwardly. 

2. T scores of 65-74 represent a range of 
scores which indicate an increasing likelihood 
of random responding or of deviant personal 
characteristics. The interpretation of profiles
with V scores in this range should be more 
tentative, exploring first the possibilities of 
random responding and then of strong con- 
flicting drives which will tend to negate or 
modify measured interests as predictors of 
future behavior. 

3. T scores of 75 and above are strongly 
indicative of random responding or of ex- 
tremely unusual interests. In these extreme 
cases the counselor would likely readminister 
the SVIB with attention being given to client 
understanding of inventory content and mark- 
ing instructions. V scores in this high range, 
if shown to be the result of attentive and 
comprehending responding, will probably in- 
validate use of the SVIB as a measure of 
interests. 
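
The three levels above can be expressed as a simple lookup. The sketch below copies the T-score equivalents from Table 5 as printed (including the entry of 0 for a raw score of 0); the function names and the wording of the returned labels are illustrative and are not part of the article.

```python
# T-score equivalents for raw V scores 0 through 33, copied from Table 5.
T_BY_RAW = [
    0, 21, 27, 31, 36, 40, 43, 46, 49, 51, 53, 55, 57, 59, 61, 62, 64,
    65, 66, 67, 68, 70, 70, 71, 72, 72, 73, 74, 74, 75, 76, 76, 79, 81,
]


def v_level(t_score):
    """Return the suggested interpretive level for a V scale T score."""
    if t_score >= 75:
        return "random responding or extremely unusual interests"
    if t_score >= 65:
        return "interpret tentatively"
    return "typical"


def classify_raw(raw):
    """Classify a raw V score using the Table 5 conversion."""
    if not 0 <= raw < len(T_BY_RAW):
        raise ValueError("raw score outside the range covered by Table 5")
    return v_level(T_BY_RAW[raw])


for raw in (16, 17, 28, 29):
    print(raw, T_BY_RAW[raw], classify_raw(raw))
```

With the printed norms, the boundaries fall between raw scores of 16 and 17 and between 28 and 29.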

At the present stage of development it is
appropriate to recommend that the V scale be
included among the scales currently scored on
the SVIB. Its usefulness in establishing re-
sponse validity is sufficiently established and
research on the implication of atypical V
scores will be expedited.


REFERENCES

ADKINS, DOROTHY C. Construction and analysis of
achievement tests. Washington, D. C.: United
States Government Printing Office, 1947.

CALLIS, R. Change in teacher-pupil attitudes related
to training and experience. Unpublished doctoral
dissertation, University of Minnesota, 1948.

GUILFORD, J. P. Fundamental statistics in psychology
and education. (2nd ed.) New York: McGraw-
Hill, 1950.

HATHAWAY, S. R., & MCKINLEY, J. C. Manual for
the MMPI. New York: Psychological Corporation,
1945.

JURGENSEN, C. E. Tables for determining phi co-
efficients. Psychometrika, 1947, 12, 17-29.

KUDER, G. F. Manual for the Kuder Preference Rec-
ord, Form C. Chicago: Science Research Associates,
1956.

STRONG, E. K., JR. Vocational interests of men and
women. Stanford: Stanford Univer. Press, 1943.

(Received September 23, 1960)


Journal of Applied Psychology
1961, Vol. 45, No. 5, 325-329


PERSONALITY AND PRODUCT USE 


W. T. TUCKER AND JOHN J. PAINTER


University of Texas 


Perhaps no subject in marketing has re- 
ceived greater attention in the past few years 
than the relationship between personality 
and purchasing behavior. All of the furor over 
motivation research is clearly predicated on 
the premise that such a relationship exists, 
although some reporters seem to assume that 
all persons are, at base, alike. Yet even here, 
the factors referred to as common to all 
persons are most often those which personality 
studies have shown to be variables rather 
than constants. For instance, the importance 
of fear of the father image, which is reputed 
to militate against the use of banking serv- 
ices, must be conceived of as varying with 
some personality characteristic such as ego
strength or emotional maturity if it is not to
influence all persons in a highly similar way. 

Talk about the importance of personality
as a marketing variable has become common
at advertising clubs and at marketing asso-
ciation meetings. The recent book by Pierre
Martineau (1957) contains a chapter entitled
“An Automobile for Every Personality.”
Charles Cannell (Ferber & Wales, 1958) says:
“It may be that the determination of airplane
travel has something to do with basic per-
sonality characteristics such as personal feel-
ings of security or insecurity” (p. 10). And
Ernest Dichter (Ferber & Wales) says confi-
dently: “What we are searching for are psy-
chological and personality elements [...]
have a dynamic effect on [...]
toward a product” (p. 26). Newman (1957)
views personality as one of the major
factors determining marketing behavior.

In the light of such points of view, it
is surprising that few efforts have been
made to demonstrate that personality charac-
teristics actually do influence marketing behavior.
The dearth of evidence on this point may be explained
in part by supposition. First, the concept
of personality itself has not been very
clearly formulated. Second, the instruments
available for the ready classification of per-
sonality types are few and generally suspect.


Third, most self-respecting psychologists are 
apparently convinced that marketing behav- 
ior, pervasive as it may be, is of interest for 
commercial purposes only. Fourth, marketers 
probably have little understanding of the 
need for experimental evidence of their 
assumptions. 

Yet it would seem that there is much to be 
learned about both personality and a large 
segment of human behavior by such studies. 
Scott's (1957) study of motion picture pref- 
erences is perhaps of less interest to the movie 
producer than it is to the individual who 
wants a clearer understanding of the person- 
ality factors isolated by the Minnesota Multi- 
phasic Index. That these factors are less than 
completely clear is indicated by Scott's in- 
ability to provide a rationale for all of the 
significant correlations. And Eysenck's (Ey- 
senck, Tarrant, Woolf, & England, 1960) re- 
cent findings that rigidity and extraversion 
relate to the number of cigarets smoked by an 
individual may be as important to the under- 
standing of those characteristics as they are 
as a possible explanation of lung cancer in 
heavy smokers. 

The present study was undertaken to test 
the hypothesis that marketing behavior is 
related to personality traits. At the same time, 
it was expected that the location of significant 
relationships would throw additional light on 
the meaning of personality characteristics 
studied. 


METHOD 


The Gordon Personal Profile was administered to 
133 students of marketing along with a so-called 
Sales and Marketing Personality Index which in- 
cluded questions on the use of headache remedies, 
cigarets, chewing gum, deodorants, mouthwash, and 
other items commonly purchased by college stu- 
dents. Blind questions were interspersed to give the 
index the appearance of a personality or interest 
test. Results were then compared to determine the 
difference in personality trait scores for groups that 
professed to different rates of product use or interest.
That the subjects accepted the index was indicated 
by the large number of students who asked after 




completing the forms if they could find out whether 
they would make good salesmen, advertisers, etc. 


Subjects 


The subjects were all students of the first course 
in marketing at the University of Texas. The great 
majority were juniors; a few were in the last semes- 
ter of their sophomore year, and others were in the 
beginning of their senior year. Since the Gordon
Personal Profile has different norms for male and 
female students, and, since the frequency of use of 
a number of products was clearly related to sex, the 
31 responses by females are not included in this 
report. Also, one subject was eliminated because he 
failed to fill out the Gordon Personal Profile com- 
pletely. While this group of subjects can hardly be 
characterized as representative of even such a limited 
universe as college juniors, for purposes of this 
study their only necessary characteristic was that of 
providing a diverse group of scores on the Gordon 
Personal Profile and reasonable diversity in response 
to questions about products.


Test Materials 


The Gordon Personal Profile was selected as the 
personality test to use since it measures four charac- 
teristics which seem intuitively meaningful as com- 
ponents of the "normal" personality and since it is 
based on college student norms. The profile rates per- 
sons on the variables of ascendency, responsibility, 
emotional stability, and sociability. 

The form used to determine use of products or 
other marketing characteristics included nine ques- 
tions relevant to the experiments and seven blind 
questions. Most of the experimental questions re- 


ferred to frequency of use of a particular product, 
as in the following: 


How frequently do you experience a headache that 
requires a headache remedy (aspirin, Bufferin, 
Anacin, etc.)?


a. Never 

b. Once or twice a year 

c. About once a month 

d. More often than once a month, but less than 
once a week 


e. Once a week or more


Questions of this sort were asked about the use of 
headache remedies, vitamins, chewing gum, tobacco, 
mouthwash, alcoholic beverages, and deodorants. Two 
other questions related to the readiness with which 
the individual accepted new styles or fashions and 
preference in automobiles.


Blind questions were rather similar to those asked 
on interest tests: 


Which of the following positions in an organiza- 
tion would you prefer to hold? 

a. Secretary-Treasurer

b. Program chairman

c. President

d. Membership chairman

e. Ordinary member, no office


The list of 16 questions was pretested in order to 
insure their clearness and to make sure that multiple 
choice answers would elicit a reasonable spread of re- 
sponse. As a result of this pretesting, multiple choice 
answers were altered to fit the normal variations in
frequency of use of various products. For instance,
the most frequent use of headache remedies indi- 
cated by answers was “once a week or more," while 
the most frequent use of deodorants was described 
as “more than once a day,” since the pretest demon- 
strated these to be common frequencies for heavy 
users. 


Procedure 


Subjects filled out both forms at a single sitting of
about 20 minutes, answering the Gordon Personal
Profile first and the Sales and Marketing Personality
Index second. While subjects were asked to fill in
their sex, age, marital status, and year in school on
the Gordon Personal Profile, names were not taken
in order to encourage the greatest frankness in re-
sponse. Each pair of tests handed out was numbered
in advance.


Instructions to Subjects 


Students in each class tested were given the follow- 
ing instructions: 


As you all know, one of the difficult problems in
business is the determination of an individual's in-
terests, or what kind of job he can do best. At-
tempts to solve such problems have led to the de-
velopment of a number of written tests, some of
which take an hour or more to complete. You
have in front of you two rather new tests that seem
to accomplish this for certain marketing jobs in
just a few minutes. We know that one of these is
moderately successful. We are interested in whether
scores on the other are different or much the same.
Do these tests really measure the same things?

To determine this, we need your help. We are not
interested in your score as an individual but in the
relationship of your score on one test with your
score on the other.

For that reason, we do not want your name on the
paper; we merely want you to answer the ques-
tions honestly and conscientiously. Instructions for
each test appear at the top of the test.

First, make sure the red number on each of the
tests is the same. Then fill in your age, sex, marital
status, and year in school on Test #1, the Gordon
Personal Profile. Then read the instructions for that
test and answer the questions. When you have fin-
ished, go directly to the second test, read the in-
structions, then answer the questions.

You will find that on both tests there are some
questions where none of the answers seems exactly
right for you. Just pick the one that seems closest
and do not worry about exact wording. Remember
to read the test instructions carefully, since you
have to answer each test in a somewhat different
way.


TABLE 1

RESPONSIBILITY SCORES FOR GROUPS WITH
DIFFERENT PATTERNS OF USE OF MOUTHWASH

                          Mean     Number
Response                  Score    of Cases
Never use mouthwash       7.26       31
Quite infrequently        5.00       40
Once or twice a week      4.50       16
Once a day                3.90       10
More than once a day      6.50        4


Analytic Method 


Results were analyzed by comparing the difference 
in mean scores on one personality characteristic for 
groups with different product use patterns.

Table 1 shows the mean scores on responsibility for 
groups which answered the mouthwash question in 
each of the possible ways. 

While responsibility seems to be inversely related
to frequency of use of mouthwash, despite relatively
high scores for the four persons who use mouth-
wash more than once a day, the number of cases in
some of the cells is too small for analysis of variance
to show a significant relationship. For this reason
the last four groups were combined and compared,
using the t test, with those who reported never using
mouthwash. The resulting t of 2.12 is significant at
the .05 level. The F test for homogeneity of variance
was insignificant.
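
The combination of the last four groups can be checked from Table 1 alone. The sketch below pools the reported means, weighting each by its number of cases; the helper name is illustrative. The t itself cannot be reproduced from the table, because the group variances are not reported.

```python
# (mean responsibility score, number of cases) for each response, from Table 1.
table_1 = {
    "Never use mouthwash": (7.26, 31),
    "Quite infrequently": (5.00, 40),
    "Once or twice a week": (4.50, 16),
    "Once a day": (3.90, 10),
    "More than once a day": (6.50, 4),
}


def pooled_mean(groups):
    """Case-weighted mean and total n of several (mean, n) groups."""
    total_n = sum(n for _, n in groups)
    return sum(m * n for m, n in groups) / total_n, total_n


users = [v for k, v in table_1.items() if k != "Never use mouthwash"]
mean_users, n_users = pooled_mean(users)
print(round(mean_users, 2), n_users)
```

The 70 subjects reporting any use of mouthwash have a pooled mean of about 4.81, against 7.26 for the 31 who report never using it.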
This sort of analysis was used for each
of the products on each of the personality charac-
teristics, with the point for division into two groups
being determined on the basis of scores and the num-
ber of subjects remaining in each of the groups.

It is entirely proper to question whether
scales of the sort used here should be dichotomized
after observing the means of each of the categories.
Such dichotomization obviously makes it possible to
maximize the number of "significant" relationships.
Where possible, it should therefore be avoided, and
an independent method should be used for
dichotomization. In locating
relationships between personality and
product use, it seems foolhardy to use an
arbitrary dichotomization method [...].
It could easily be the case that
only extremes of product use are related
to personality measures and that the cutting point
[...] if the median were arbitrarily used, for instance.

It happens that dichotomizing the data shown in
Table 1 by combining the top two categories and the
bottom three does not lead to a significant t. The
resulting quandary is more philosophical than statisti-
cal. It seems to the authors that refusing to locate
the cutting point that leads to statistically significant
differences is the more serious error when dealing
with the kind of problem discussed here.


RESULTS AND DISCUSSION


A total of 36 comparisons (9 product
categories x 4 personality characteristics) in-
cluded 13 significant relationships. As might
be expected, some products were associated
with no personality trait; others were associ-
ated with one or more; and one product, vita-
mins, was associated with all four of the per-
sonality traits.
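
How far 13 significant results in 36 comparisons exceeds chance expectation can be gauged with a binomial calculation. The sketch below assumes 36 independent tests each run at the .05 level. That assumption is not met here, since the traits are intercorrelated and the cutting points were chosen after inspecting the data, so the result is a rough yardstick rather than a formal test.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_tests, n_significant, alpha = 36, 13, 0.05
print(n_tests * alpha)  # number of "significant" results expected by chance
print(binomial_tail(n_tests, n_significant, alpha))
```

Under the independence assumption fewer than two significant results would be expected by chance, and the probability of 13 or more is very small.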

Table 2 shows those relationships indicat- 
ing the significance level. In addition it shows 
correlation ratios in parentheses to indicate 
the approximate strength of the relationships. 

The results clearly indicate that there is a
relationship between product use and person-
ality traits. This relationship apparently may
include both frequency of use of a particu-
lar product and preference among different
brands of a single product, since preference in
automobiles is significantly related to scores
on responsibility. At the same time, some
products are used frequently or infrequently
without relationship to any of the personality
traits tested. Each personality trait seems to
bear a relationship to the use of some prod-
ucts, each of the four traits scored by the
Gordon Personal Profile relating to the use of
at least two of the products considered in the
present experiment.

It should be pointed out, however, that the
relationships located between product use and
personality are not particularly strong, cer-
tainly less strong than popular marketing con-
cepts of the day suggest.

An obvious corollary to the conclusion that
personality traits and product use are related
is that the Gordon Personal Profile does iso-
late personality traits related to behavioral
differences. Further, an examination of the
pattern of significant relationships shown in
Table 2 is persuasive that the four traits,
ascendency, responsibility, emotional stability,
and sociability, have considerable independ-
ence. The manual for the test indicates that
the intercorrelations are generally low except
for those between ascendency and sociability
(.43) and between emotional stability and
responsibility (.46). Those correlations were
considerably higher in the present experiment
as shown in Table 3. The remaining correla-
tions are quite low. It must be concluded that
the Gordon Personal Profile does not meas-
ure four independent characteristics but two
independent sets of related characteristics.
However, it seems that one of a set of related
characteristics can still prove to have enough
relative independence to be conceptually
valuable.


TABLE 2
SIGNIFICANT PERSONALITY TRAITS IN THE USE OR PREFERENCE FOR SOME CONSUMER PRODUCTS

                                           Responsi-   Emotional
                              Ascendency   bility      Stability   Sociability
Headache remedies             -.05         —           -.05        —
                              (.464)ᶜ                  (.320)ᶜ
Acceptance of new fashions    .01          —           —           .01
                              (.331)ᶜ                              (.566)ᶜ
Vitamins                      -.05         -.01        -.01        -.05
                              (.332)ᶜ      (.297)ᶜ     (.091)ᶜ     (.272)ᶜ
Cigarettes                    —            —           —           —
Mouthwash                     —            -.05        —           —
                                           (.224)ᶜ
Alcoholic drinks              —            -.01        —           —
                                           (.362)ᶜ
Deodorants                    —            —           —           —
Automobilesᵃ                  —            .01         —           —
                                           (.281)ᶜ
Chewing gumᵇ                  —            .05         .01         —
                                           (.295)ᶜ     (.331)ᶜ

Note. In all cases except for the last two products, the sign indicates the nature
of the relationship. High ascendancy is related to infrequent use of headache
remedies, for instance, but with rapid acceptance of new fashions.
ᵃ Subjects who preferred the more popular makes of car such as Buick, Dodge,
Mercury, Ford, Chevrolet, and Plymouth rated higher on the responsibility scale
than those who stated a preference for such sports cars as the Corvette or
Thunderbird.
ᵇ While there is no significant difference in personality trait scores and the
amount of gum chewed, those who chew gum only when offered it by someone else,
are significantly lower than others in responsibility and emotional stability.
ᶜ Correlation ratios.


TABLE 3
INTERCORRELATIONS AMONG PERSONALITY TRAITS

                        Ascend-   Respon-    Emotional
Traits                  ency      sibility   Stability
Responsibility          .058
Emotional Stability     .035      .695
Sociability             .708      .035       .086


Most of the significant relationships be-
tween product use and character traits located
are intuitively acceptable. One would expect
that high ascendency and high sociability
would be related to the rapid acceptance of


new fashions, especially since ascendency }§ 
described largely as socit:teadership. On the 
other hand there seemsjob be no particula! 
reason for expecting all personality character- 
istics to be associated with the frequency ° 
use of vitamins, unless one conceives that per- 
sonality traits are most likely to affect behav- 
ior that society neither rewards nor punishes- 
The results cast some possible light on en 
nature of responsibility as a character tral 
It is related to avoidance of vitamins p 
mouthwash, preference for popular cars em 
moderate drinking or abstinence. Since thes? 
are all modal characteristics of the group T 
ing tested, a reasonably strong case might ly 
made for the fact that responsibility is close 
related to the acceptance of group norms- ai 
A comparison of the present results Lara 
those of Eysenck (1960) suggests that ur 
ciability on the Gordon Personal Profile | 
considerably different from extroversion, wi 
which it might seem related. Eysenck's rest 
showed a strong, significant correlation ate 
tween extroversion and heavy cigarette pr 
ing, while the present experiment did not City 
hint at such a relationship between sociab! 


Personality and Product Use 


and heavy smoking. It is possible that the dif- 
ference in age (Eysenck’s subjects were con- 
siderably older) or difference in nationality 
(Eysenck’s subjects were British) might explain
this apparent contradiction.


SUMMARY 


The answers to the Gordon Personal Profile 
and a disguised product use questionnaire by 
101 college of business students demonstrate 
that personality traits are often related to 
product use. Thirteen of a possible 36 such 
relations were significant at the .05 level or 
above. 
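As a rough check on how far thirteen significant relations exceed chance, the following Python sketch computes the expected number of significant results and the binomial tail probability. It rests on a simplifying assumption not made in the study itself, namely that the 36 tests are independent; because the four traits are intercorrelated, the result is only approximate. The function name is illustrative.

```python
from math import comb


def binomial_tail(n_tests: int, n_significant: int, alpha: float) -> float:
    """P(X >= n_significant) for X ~ Binomial(n_tests, alpha).

    Assumes independent tests, which is only an approximation here.
    """
    return sum(
        comb(n_tests, k) * alpha**k * (1 - alpha) ** (n_tests - k)
        for k in range(n_significant, n_tests + 1)
    )


n_tests, n_significant, alpha = 36, 13, 0.05
expected_by_chance = n_tests * alpha  # about 1.8 relations
tail_probability = binomial_tail(n_tests, n_significant, alpha)

print(f"expected by chance: {expected_by_chance:.1f}")
print(f"P(at least {n_significant} of {n_tests}): {tail_probability:.2e}")
```

Under these assumptions fewer than two of the 36 relations would be expected to reach the .05 level by chance alone.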

A corollary conclusion is that the Gordon 




Personal Profile distinguishes personality traits 
related to behavioral differences, although the 
four traits are not "independent." 
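The degree of non-independence can be illustrated by squaring the intercorrelations reported in Table 3 to obtain the proportion of shared variance. The sketch below is illustrative only: the dictionary layout and function name are assumptions, and the Ascendency-Sociability value is entered as .708 on the assumption that the printed entry is positive.

```python
# Intercorrelations among the four Gordon Personal Profile traits (Table 3).
# The .708 entry is an assumed reading of the printed value.
correlations = {
    ("Ascendency", "Responsibility"): 0.058,
    ("Ascendency", "Emotional Stability"): 0.035,
    ("Ascendency", "Sociability"): 0.708,
    ("Responsibility", "Emotional Stability"): 0.695,
    ("Responsibility", "Sociability"): 0.035,
    ("Emotional Stability", "Sociability"): 0.086,
}


def shared_variance(r: float) -> float:
    """Proportion of variance two traits have in common (r squared)."""
    return r * r


# List the pairs from the strongest to the weakest relationship.
for (trait_a, trait_b), r in sorted(
    correlations.items(), key=lambda item: -abs(item[1])
):
    print(f"{trait_a} x {trait_b}: r = {r:.3f}, shared = {shared_variance(r):.1%}")
```

Two of the six pairs share roughly half of their variance, while the remaining four share less than 1%.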


REFERENCES 


Eysenck, H. J., Tarrant, Mollie, Woolf, Myra, &
England, L. Smoking and personality. Brit. med.
J., 1960, 5184, 1456-1460.

Ferber, R., & Wales, H. Motivation and marketing
behavior. Homewood, Ill.: Irwin, 1958.

Martineau, P. Motivation in advertising. New York:
McGraw-Hill, 1957.

Newman, J. W. Motivation research and marketing
management. Boston: Harvard Univer. Press, 1957.

Scott, E. M. Personality and movie preference. Psychol.
Rep., 1957, 3, 17-18.


(Received October 10, 1960) 


Journal of Applied Psychology. 
1961, Vol. 45, No. 5, 330-337 


FACTOR ANALYTIC DEFINITIONS OF VOCATIONAL MOTIVATION


JOHN O. CRITES


State University of Iowa


Reflecting the increasing emphasis during 
the past decade upon the role of dynamic 
factors in occupational choice and vocational 
adjustment (Ginzberg, Ginsburg, Axelrad, & 
Herma, 1951; Roe, 1956), recent research 
on the measurement of vocational phenomena 
deals largely with the construction of instru- 
ments designed to assess such motivational 
variables as preferences for different occupa- 
tional activities and situations, evaluations of 
the desirability of various job components, 
and judgments about the relative appeal of 
tangible and intangible work incentives. 
Among the new measures of vocational moti- 
vation are the Occupational Attitude Rating 
Scales (OARS) developed by Hammond 
(1954, 1956, 1959), the Work Satisfaction 
Questionnaire (WSQ) devised by Astin
(1958), and the Job Incentive Rankings
(JIR) constructed by Bendig and Stillman
(1958). Based upon factor and cluster analyses
of item ratings and rankings, these self-report
questionnaires yield scores on 11 vocational
motivation scales: Materialistic,
Competitive, Technical, and Humanitarian
(OARS); Managerial-Aggressive, Status-Need,
Organization-Need, and Working Conditions²
(WSQ); and Achievement-Need,
Intrinsic Job Interest, and Job Autonomy
(JIR). Within the three inventories, due to
their factorial composition, scores on one 
scale are essentially independent of those on 
another, the highest correlation being .28 
for the Materialistic and Competitive scales 
of the OARS (Hammond, 1956). 

The intercorrelations of the scales from the 
various inventories are unknown, but it seems 
reasonable that they are related and define 


¹ The author wishes to express his appreciation
to Harold P. Bechtoldt for his generous assistance
in processing and analyzing the data.

² Although Astin (1958) assigns no name to the
item cluster which constitutes this scale, for convenience
in discussion it is called "Working Conditions."


similar variables. Both Hammond (1956)
and Astin (1958) found differences between
college major groups on the OARS and WSQ
which indicate possible relationships between
the scales of these questionnaires. For example,
the Materialistic (OARS) and Managerial-Aggressive
(WSQ) scales differentiate
business students from those in other curricula,
and the Technical (OARS) and Organization-Need
(WSQ) scales distinguish
between science and nonscience majors.
Bendig and Stillman (1958) report no group
comparisons, but the item content of the
JIR scales suggests that they measure attitudes
much like those assessed by the OARS
and WSQ. To illustrate, the Materialistic
(OARS) and Job Autonomy (JIR) scales
contain almost identical items (“economic
security” and “good job security”), and the
Managerial-Aggressive (WSQ) and Achievement-Need
(JIR) scales express quite similar
preferences and values (“A job where I do
not work under instructions” and “Freedom
to assume responsibility”). There is some
evidence, then, that the scales of the recently
developed vocational motivation inventories
are not independent and that they cluster
into distinguishable groups.

The purpose of the present study was to
test for the existence of interrelationships
among the OARS, WSQ, and JIR scales and
to identify the groups which the scales form.
More specifically, the objective was to reduce
the number of scales to a more parsimonious
set and to provide factor analytic definitions
of the dimensions of vocational motivation.
The anticipated outcomes of the analysis
were a clarification of the structure of vocational
motives, an evaluation of the composition
of the inventory scales, and a basis for
the integration of theoretical constructs.


PROCEDURE


Measuring Instruments. The Occupational Attitude
Rating Scales contain 40 items for each




of four scales, which a subject rates on a five-point
scale. The scale extends from “much liked,” through
“indifferent,” to “much disliked.” The scores for
the scales are the unweighted totals of keyed items.
Internal consistency estimates for the scales range
from .66 (Materialistic) to .80 (Humanitarian)
(Hammond, 1956). The Work Satisfaction Questionnaire
has 19 statements about different kinds of
work which a subject rates on a seven-point scale
of desirability. Scale scores are the sums of the
ratings. Astin (1958) reports no reliability data for
the scales of the questionnaire. The Job Incentive
Rankings are made with 8 verbal descriptions of
job incentives, which the subject orders according
to their importance to him in choosing an occupation.
Scores are obtained by subtracting the ranks
of pairs of job incentives which define the three
scales. Bendig and Stillman (1958) present no findings
on the reliability of the rankings.
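The two scoring rules described above can be sketched in Python. Everything named here is hypothetical: the item numbers, the keyed set, and the incentive labels are placeholders, since the actual scoring keys are not reproduced in this article. The OARS rule is read as one point for each keyed item answered in the keyed direction, and the JIR rule as a difference between the ranks of a pair of incentives; the direction of subtraction is also an assumption.

```python
def oars_scale_score(responses: dict[int, bool], keyed_items: set[int]) -> int:
    """Unweighted total: one point for each keyed item the subject endorses.

    `responses` maps a hypothetical item number to True when the rating
    falls in the keyed direction.
    """
    return sum(1 for item in keyed_items if responses.get(item, False))


def jir_scale_score(ranks: dict[str, int], pair: tuple[str, str]) -> int:
    """Difference between the ranks of the two incentives defining a scale.

    Ranks run from 1 (most important) to 8 (least important); subtracting
    the first rank from the second is an assumed convention.
    """
    first, second = pair
    return ranks[second] - ranks[first]


# Hypothetical data for one subject.
responses = {1: True, 2: False, 3: True, 4: True}
keyed_items = {1, 3, 4, 7}
ranks = {"incentive_a": 2, "incentive_b": 7}

oars_score = oars_scale_score(responses, keyed_items)
jir_score = jir_scale_score(ranks, ("incentive_a", "incentive_b"))
print(oars_score, jir_score)
```

Because a JIR score is a difference of ranks, it can be negative as well as positive.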
Subjects. The sample consisted of 300 subjects
from a large undergraduate, introductory course in
psychology. All subjects had at least sophomore
class standing. They took the OARS, WSQ, and
JIR voluntarily as part of the course and received
additional credit for participating in the test
administration. In the sample there were equal numbers
of males and females. Because preliminary
analysis of the data indicated possible sex differences
in intertest relationships, the male and female
variance-covariance matrices were compared for
homogeneity. The findings (t = 72, df = 131, ...)
supported the homogeneity of the matrices,
however, and males and females were combined
for the factor analysis proper, which was based
upon the entire sample of 300 subjects.

Method. The 11 variables for the OARS, WSQ, 
and JIR were intercorrelated by the usual product- 
moment techniques, and the resulting matrix of 55 
coefficients was factor analyzed by the complete 
centroid method, after the hypothesis of inde- 
pendence among the inventory scales was rejected 
(R. Bargmann, 1957 unpublished). Four factors were 
iterated through two cycles by Bargmann’s pro- 
cedure, and Lawley’s (1951) maximum likelihood 
test of the independence of residual covariances was 
applied to determine whether additional common 
factors remained. Since the test was significant (χ²
= 23.49, df = 10, p < .01), one more iteration was
made, and nonsignificant residuals which ranged from
-.048 to .046 were obtained. With analytic (oblimax)
and graphical methods, the resulting five factors
were then rotated to an orthogonal simple structure
solution.
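As an illustration of the extraction step, the sketch below computes the number of distinct coefficients among the scales and the first centroid factor of a correlation matrix. It is a minimal pure-Python sketch under stated assumptions, not the procedure used in the study: the diagonal is taken as given (communality estimation and Bargmann's iteration are not shown), the sign reflection needed before extracting later factors is omitted, and the 3 x 3 matrix is made-up example data.

```python
from math import sqrt


def n_distinct_correlations(n_variables: int) -> int:
    """Number of off-diagonal coefficients in a symmetric correlation matrix."""
    return n_variables * (n_variables - 1) // 2


def first_centroid_factor(matrix: list[list[float]]) -> list[float]:
    """Loadings on the first centroid factor: column sums / sqrt(grand total)."""
    column_sums = [sum(column) for column in zip(*matrix)]
    grand_total = sum(column_sums)
    return [s / sqrt(grand_total) for s in column_sums]


def residual_matrix(
    matrix: list[list[float]], loadings: list[float]
) -> list[list[float]]:
    """Correlations left after removing the variance explained by one factor."""
    size = len(matrix)
    return [
        [matrix[i][j] - loadings[i] * loadings[j] for j in range(size)]
        for i in range(size)
    ]


# Made-up example: three variables that all intercorrelate .50.
example = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
loadings = first_centroid_factor(example)
residuals = residual_matrix(example, loadings)

print(n_distinct_correlations(11))  # 11 scales give 55 coefficients
print([round(a, 3) for a in loadings])
```

After the first factor is removed, each column of the residual matrix sums to zero, which is why later centroid factors require reflecting the signs of some variables before the step is repeated.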


RESULTS 


Table 1 presents the  intercorrelation 
matrix for the OARS, WSQ, and JIR scales, 
as well as the means and standard deviations 
for each variable. As compared with total 
possible score ranges, and in ratio to the 
means, the standard deviations for the scales 
indicate adequate variability for the purposes 
of correlational and factorial analyses. From 
inspections of the score distributions for all 


TABLE 1

INTERCORRELATIONS OF OCCUPATIONAL ATTITUDE RATING SCALES (OARS), WORK SATISFACTION
QUESTIONNAIRE (WSQ), AND JOB INCENTIVE RANKINGS (JIR) VARIABLES FOR
MALES AND FEMALES COMBINED

(N = 300)
5 6 a 8 9 10 M SD 
9 Variables 1 4 $ : 
ARs. 
i 4.80 2.48 
z Materialistic s 6.65 2.01 
' Competiti 2 
3, petitive 5.03 2.28 
4. Technical (| 03 “ " 5.37 2.58 
Sg; 'umanitarian -0$ 
S 32.16 4. 
Manageria. E ae uuu Él 
6. g Egressive — 15.97 
: ; S EINE: 97 4.80 
$ Onts-Need 26 ^ 3j 22 29 —10 28.70 4.85 
n & Banization-Need 16 S —08 —22 00 28 02 14.77 3.70 
R: tking Conditions 12, |o 
9. = —16 5 
P 11 16 —14 —09 9.85 3.16 
4 sic Job Interest | 7?» — Sides SR Ph c. =09))! 10: 
j Joh utonomy ig —14 —02 02 =I 0.47 3.22 


Note.—Decimal points omitted for correlation coefficients.
At .05 level of significance r = .15; at .01 level of significance r = .23.




TABLE 2

FACTOR LOADINGS OF OCCUPATIONAL ATTITUDE RATING SCALES (OARS), WORK SATISFACTION
QUESTIONNAIRE (WSQ), AND JOB INCENTIVE RANKINGS (JIR) VARIABLES

(N = 300)

                                       Rotated Factors
Variables                         A     B     C     D     E    h²

OARS:
  1. Materialistic               56    00    03    05   -05    51
  2. Competitive                 03   -08    52    04   -07    54
  3. Technical                  -02   -03    00    70    01    54
  4. Humanitarian               -04   -40    40   -01   -03    35
WSQ:
  5. Managerial-Aggressive      -39    05    74   -07    34    59
  6. Status-Need                 04    39   -04   -03   -05    36
  7. Organization-Need           03    04    44    35    48    51
  8. Working Conditions         -03    50    02    02    30    30
JIR:
  9. Achievement-Need           -62    00    06    15   -07    51
 10. Intrinsic Job Interest     -39   -16    15    02    05    32
 11. Job Autonomy                37   -05   -04   -05    51    36

Note.—Decimal points omitted.


scales and the scatterplots of five randomly
selected pairs of scales, the assumptions of 
normality and linearity of regression seem 
tenable. 

Table 2 lists the rotated factor loadings 
and communalities for the 11 variables. Fac- 
tors A and B have both positive and negative 
loadings of sufficient magnitude to make 
them bipolar. The highest absolute loadings 


Fig. 1. Vocational motivation factors. (Diagram labels: A, Material Security and Job Freedom; B, Personal Status and Social Service; Social Approval; Working.)


are on Factors C and D and the lowest on
Factors B and E, with Factor A intermediate.
Six of the scales (Materialistic, Competitive,
Technical, Status-Need, Achievement-Need,
and Intrinsic Job Interest) load on
only one factor, whereas the other five (Humanitarian,
Managerial-Aggressive, Organization-Need,
Working Conditions, and Job
Autonomy) relate to two or more factors.
With the exception of the negative pole of
Factor B, as defined by the Humanitarian
scale, the factors are composites, formed
by combinations of two or more
scales.

Figure 1 presents graphically the various
factors and some of the scales which load on
two or more of them. The scales which correlate
with only one factor are not shown, in
order to simplify the diagram, but they
would be within the larger circles. The circles
depict the total variance in a factor or scale,
and the amount of overlap between circles
represents the approximate percentage of
common variance. The connected circles for
Factors A and B indicate that they are bipolar.
Factor D is superimposed upon Factor
B, but the two factors are actually independent.




DISCUSSION 


To facilitate the interpretation of the re- 
sults and to make explicit the bases for 
inferences about the meaning of the factors,
refer to Figure 1 and Table 3. In the latter,
the columns (factors) and rows (scales) 
correspond to those in Table 2, but the 
entries are the items from the various voca- 
tional motivation scales. For a given scale, 




items were classified according to the factors 
which they seemed to define. Thus, reading 
down a column provides a summary of the 
item content for a factor and suggests some 
hypotheses about the variable it measures. 

The factors were named, as much as pos- 
sible, to indicate the dimension of vocational 
motivation they define, instead of the type 
of work which satisfies a particular need or 
value. Three general dimensions, each with 


specific aspects, emerged. As discussed below,
these were security, status, and service.


TABLE 3

OARS, WSQ, AND JIR SCALE ITEMS CLASSIFIED BY VOCATIONAL MOTIVATION FACTORS

Factors: A. Material Security vs. Job Freedom; B. Personal Status; C. Social Approval; D. System; E. Structure

  1. Materialistic
       A (Positive): Economic security; Earn big money; Regular salary; Earn fat bonuses; Lots of overtime; Quick financial return
  2. Competitive
       C: Be the “Top Man”; Room for self-expression; Be able to assert self; Looked up to in the community; Heavy responsibilities; Provide leadership; Make a name for self; Recognition
  3. Technical
       D: Using math; Attention to details; Working with figures; A methodical approach; Working with theory; Using hands and tools skillfully; Working in privacy
  4. Humanitarian
       B (Negative): Being patient with people; Hearing people's troubles; Be of service to others; Change people for better; Help others help themselves
       C: Initiate group action; Influence people's lives; Improve social conditions; Right social wrongs; Help the unfortunate
  5. Managerial-Aggressive
       A (Negative): A job where I do not work under instructions; Work where the duties change frequently
       C: Directing, controlling, and planning activities of others; Work where I influence opinions and attitudes of others; A job where I work under stress; A job where I express my personal ideas and feelings
       E: Work which keeps me very busy all of the time
  6. Status-Need
       B (Positive): Recognition from others; Travel frequently; High salary; Live in large city
  7. Organization-Need
       C: Working closely along with others; Working with people who have similar interests
       D: Make judgments according to measurable standards; A job where I must attain exact standards
       E: Work which keeps me very busy all of the time; A job where the work must be performed according to a set time schedule
  8. Working Conditions
       B (Positive): A job where I work indoors all of the time; A job with no physical activity; Live in large city
       E: Work where the duties do not change frequently
  9. Achievement-Need
       A (Negative): Freedom to assume responsibility
       D: Opportunity to learn new skills
 10. Intrinsic Job Interest
       A (Negative): Friendly fellow workers
 11. Job Autonomy
       A (Positive): Good job security
       E: Good job security

Note.—The Managerial-Aggressive and Organization-Need scales of the Work Satisfaction Questionnaire have the item
“Work which keeps me busy all of the time” in common.


Factor A—Material Security vs. Job Freedom.
The loadings of the so-called Job
Autonomy (JIR) scale and the Materialistic
(OARS) scale define the positive pole of this
factor as a motivation for the extrinsic rewards
rather than the intrinsic satisfactions
of work. The emphasis in the negative pole is
upon those aspects of the Achievement-Need
(JIR) and Managerial-Aggressive (WSQ)
scales which pertain to freedom in work, particularly
as expressed in varied job duties
and minimal supervision. The two poles of
this factor closely correspond to the extrinsic
and intrinsic sources of job satisfaction suggested
by Ginzberg et al. (1951).

Factor B—Personal Status vs. Social Service.
The positive pole of this factor, as defined
by the Status-Need (WSQ) and Working
Conditions (WSQ) scales, reflects an
egocentric need for the status which comes
from a highly remunerative, prestigeful,
white collar job. Unlike the status acquired
through the admiration and acceptance of
others, which Factor C (Social Approval)
assesses, the essence of the need for personal
status is affluence rather than influence. The
negative pole is the familiar “helping others”
motivation, but always on an intimate, personal
level, not through organized activities
and large-scale programs. Evidently, there are
two distinct modes for the expression of
humanitarian motives: one is in face-to-face
individual relationships, and the other is
through group enterprises, as in Factor C
(Social Approval).

Factor C—Social Approval. Unidentified
in previous work, but intriguing as a dimension
of vocational motivation, this factor appears
to represent a fusion of attitudes inherent
in the Protestant Ethic and American
Capitalism (Whyte, 1956). It reflects the
appeal of those social rewards, e.g., recognition
as a leader and community servant,
which stem from the initiation and implementation
of political and service programs
through aggressive executive action. The
prototype of the person motivated by “social
approval,” perhaps because of guilt feelings
about his material well-being, is the
successful businessman who organizes charities
and establishes funds in the public interest.

Factor D—System. Best defined by the
Technical (OARS) scale, which emphasizes
aspects of precision and skill,
and by the Organization-Need (WSQ) and
Achievement-Need (JIR) scales, this factor was
named “System” to identify its characteristics:
order, planfulness, detail,
perfectionism, and impersonality. These characteristics
define an orientation which contrasts
sharply with that reflected in Factor
E (Structure). The difference is similar to
that which exists between the “methodical”
man and the “mechanical” man: one creates
plans, and the other carries them out.

Factor E—Structure. Other designations, 
such as “Organizational Security,” which 
emphasize the importance of duties and tasks, 
routines, and schedules as means of anxiety 
reduction (Kates, 1950), also capture the 
meaning of the scales that form this factor. 
In this instance, as contrasted with Factor 
A (Material Security), the formality of an 
impersonal organizational structure, rather 
than the accumulation of worldly goods, 
serves to provide a source of security. 

These interpretations of the factors are 
necessarily only suggestive and tentative, but 
they seem meaningful and provide hypotheses 
for further research. 

With respect to the OARS, WSQ, and JIR 
scales, their intercorrelations and factorial 
loadings provide some insight into both their 
composition and potential usefulness: 

1. Occupational Attitude Rating Scales. 
The Materialistic and Technical scales are 
factorially pure and measure what their 
names imply, although they might better be 
identified by more psychological and motiva- 
tional terms, such as the needs for “Material 
Security” and “System.” The Competitive 
scale is also factorially simple, but it seems 
to measure less a need for “Personal Status,” 
as Hammond (1956) suggests, than a need 
for “social status,” as indicated by its load- 
ing on Factor C (Social Approval). Unlike 
the others, the Humanitarian scale is factori- 
ally complex: one part pertains to assisting 
people through empathic understanding, the 
other represents helping them through group 
action. 

2. Work Satisfaction Questionnaire. With
the exception of the Status-Need scale, which 
assesses a desire for prestige and self-ag- 
grandizement, the scales in this inventory 
are nonspecific and nonunitary. The Mana- 
gerial-Aggressive scale breaks down into three 
distinct parts—job freedom and variety, con- 
trolling and directing, and rapid work pace— 
which support the appropriateness of its 
name. Similarly, the Organization-Need Scale 
subdivides into various components, but only 
the items which load on Factor E (Structure) 
are consistent with the scale's title. The 
Working Conditions scale, which Astin 




(1958) refrained from naming, assesses es- 
sentially what its title connotes and has suffi- 
cient promise to recommend its use in fur- 
ther research. 

3. Job Incentive Rankings. The incentive 
rankings have two major shortcomings. First, 
the names of the scales are largely inap- 
propriate to express what they measure. The 
Achievement-Need scale pertains more to job 
autonomy than to "getting ahead." Likewise, 
the Intrinsic Job Interest scale concerns con- 
comitant work satisfactions rather than satis- 
factions derived from the work per se. Also, 
the Job Autonomy scale measures security 
needs, not the desire for freedom in work. 
And, second, the scales are too restricted in 
their coverage of the vocational motivation 
domain. They are not comprehensive enough 
to represent more than one or possibly two 
of the five factors derived from the analysis.
Probably the best set of scales to define
all of the factors, for both males and females, 
since there are no sex differences, is the fol- 
lowing: Factor A—Materialistic (OARS) 
and Achievement-Need (JIR), Factor B— 
Status-Need (WSQ) and Humanitarian
(OARS), Factor C—Managerial-Aggressive
(WSQ), Factor D—Technical (OARS), and 
Factor E—Organization-Need (WSQ). 

As far as theories of vocational motivation 
are concerned, they propose various drives, 
instincts, and needs as basic in explaining 
why individuals make differential choices of 
occupations and seek dissimilar satisfactions 
in work. Among the energizing states and 
conditions which supposedly elicit and pro- 
duce individual differences in vocational be- 
havior are the aggressive and destructive 
drives (Menninger, 1942), the mastery in- 
stinct (Hendrick, 1943a, 1943b), the work 
and pleasure orientations (Ginzberg et al., 
1951), and the physiological and self-actuali- 
zation needs (Roe, 1956). To assess the value 
of these constructs as definitions of independ- 
ent dimensions of vocational motivation it is 
necessary to translate them into operational 
terms and to determine their interrelation- 
ships. The OARS, WSQ, and JIR scales meas- 
ure a number of the variables specified in the 
theories, and their intercorrelations and fac- 
torial loadings reveal the extent to which 
the variables covary. The findings agree with 
the use of several theoretical constructs to 




account for individual differences in voca- 
tional motivation: there is no one general 
vocational motivation factor. More specifically,
they suggest that the needs for security,
status, and service are unique vocational
motives and that Ginzberg's theory of
extrinsic, concomitant, and intrinsic work
satisfactions is the most meaningful one as a
frame of reference for these concepts.

In further research on the factors as defi- 
nitions of vocational motivation, one of the 
most important problems concerns the demonstration
of their construct validity. The
present study establishes the factors as
unique variables, and suggests hypotheses
about their meaning, but it offers no independent
evidence of what they measure. Two
kinds of additional studies are needed: one
on the relationship of the factors to nontest
indices of motivation, and the other on the
possible drive properties of the factors, such
as their efficacy in the facilitation of a wide
variety of responses and in the learning of
new responses (Brown, 1961). Once there
is support for the factors as measures of
motivation, rather than habits or response
tendencies, research on their relationships to
vocational interests (Darley & Hagenah,
1955), occupational membership (Roe,
1956), and job satisfaction (Ginzberg et al.,
1951) will be possible.


SUMMARY 


Recent theoretical emphases upon the relationship
of motivational variables to occupational
choice and vocational adjustment
have resulted in the development of
instruments designed to measure such vocational
motives as competition, achievement,
and independence. Since the various scales of
these newly constructed questionnaires and
inventories either differentiate similar educational
and vocational groups or have comparable
item content, relationships among
them are suggested. The present study was
designed to test for the existence of these
interscale relationships and to formulate factor
analytic definitions of vocational motives.

From an 11-variable intercorrelation
matrix based upon test data for 300 subjects
(150 males and 150 females), five factors
were extracted, following the complete
centroid method. After rotation, the factors
were identified as follows: A—Material Security
vs. Job Freedom, B—Personal Status
vs. Social Service, C—Social Approval, D—System,
and E—Structure. These factors appeared
to provide a comprehensive survey of
the dimensions of vocational motivation usually
mentioned in theories of occupational
choice and vocational adjustment. Interpretations
of the factors were made, and the implications
of the findings for theory construction
were discussed. Some possible problems
for future research on the construct validity
of the factors and their relationships to other
vocational variables were outlined.


REFERENCES 


Astin, A. W. Dimensions of work satisfaction in
the occupational choices of college freshmen. J.
appl. Psychol., 1958, 42, 187-190.

Bendig, A. W., & Stillman, Eugenia L. Dimensions
of job incentives among college students.
J. appl. Psychol., 1958, 42, 367-371.

Brown, J. S. The motivation of behavior. New
York: McGraw-Hill, 1961.

Darley, J. G., & Hagenah, Theda. Vocational interest
measurement. Minneapolis: Univer. Minnesota
Press, 1955.

Ginzberg, E., Ginsburg, S. W., Axelrad, S., &
Herma, J. L. Occupational choice. New York:
Columbia Univer. Press, 1951.

Hammond, Marjorie. Occupational attitude rating
scales. Personnel guid. J., 1954, 32, 470-474.

Hammond, Marjorie. Motives related to vocational
choices of college freshmen. J. counsel. Psychol.,
1956, 3, 257-261.

Hammond, Marjorie. Attitudinal changes of “successful”
students in a college of engineering. J.
counsel. Psychol., 1959, 6, 69-71.

Hendrick, I. The discussion of the “instinct to
master.” Psychoanal. Quart., 1943, 12, 561-565.
(a)

Hendrick, I. Work and the pleasure principle.
Psychoanal. Quart., 1943, 12, 311-329. (b)

Kates, S. L. Rorschach responses related to vocational
interests and job satisfaction. Psychol.
Monogr., 1950, 64(3, Whole No. 309).

Lawley, D. N. The maximum likelihood method
of estimating factor loadings. In G. Thomson
(Ed.), The factorial analysis of human ability.
(5th ed.) New York: Houghton Mifflin, 1951.

Menninger, K. A. Work as sublimation. Bull. Menninger
Clin., 1942, 6, 170-182.

Roe, Anne. The psychology of occupations. New
York: Wiley, 1956.

Whyte, W. H. The organization man. New York:
Simon & Schuster, 1956.


(Received November 25, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 5, 338-344 


INDIVIDUAL AND GROUP CORRELATES OF ATTITUDES 
TOWARD WORK-RELATED CHANGE¹


DON A. TRUMBO 


Kansas State University 


This report is one in a series describing a 
set of studies on technological change. An 
earlier report (Jacobson, Trumbo, Cheek, & 
Nangle, 1959) described the research setting, 
sample, and data gathering procedures and 
presented selected findings regarding em- 
ployee perceptions of and attitudes toward a 
specific technological change event. 

In the present study, individual and group 
correlates of attitudes toward change were 
explored for the purpose of refining a number 
of hypotheses regarding those factors which 
condition employees' attitudes toward work-related
change. The term “work-related
change” serves to distinguish the more gen- 
eral attitude object of the present study from 
the specific change event, reactions to which 
were described in the initial report. 

Prior research on response to industrial 
change has focused on the work group as the 
unit of study. Resistance to change has been 
defined in terms of group performance meas- 
ures and turnover rates (Bavelas, 1947; 
Coch & French, 1948). The effectiveness of 
“group participation” methods of adminis- 
tering change, as indicated in the results of 
these studies, strongly suggests that the lead- 
ership attitudes and behavior of the super- 
visor, in his role as change agent, may con- 
dition group attitudes toward change. 

While these studies provide insight into 
the factors which facilitate the acceptance of 
specific change events, no research known to 
this writer has systematically investigated 
the attitudes of employees toward change as 


¹ This paper is based on portions of the writer’s
doctoral dissertation (Trumbo, 1958). The writer
wishes to acknowledge the guidance and assistance
of James S. Karslake, major professor, and of
Eugene Jacobson, Milton Rokeach, and Donald M. 
Johnson. This study was conducted as one of a 
series of projects on technological change supported 
by the Labor and Industrial Relations Center, 
Michigan State University, under the research di- 
rectorship of Jack Stieber. Jacobson served as 
project director. 


a general work-related phenomenon. Neither 
correlates of individual change attitudes nor 
situational factors associated with differences
among attitudes in work groups have been 
identified. 

In the present study it was assumed that 
attitudes toward change could be meaningfully
related to personal data items and other
indices of employee needs and abilities, and 
that from these relationships more specific 
hypotheses about the underlying need struc- 
ture and conditioners of change attitude 
could be derived. It was also assumed that 
the social psychological climate of the work 
group, as reflected in measures of supervisor's
attitudes and group cohesiveness, conditions 
the change attitudes of the group. 

At the level of the individual employee
the following general hypotheses were stated:

Attitudes toward change are positively related
to the job investment of the employee.
Job investment was defined in terms of the
employee's identification with the work,
the economic need for the job, and length of
service with the company. In addition, the
age and sex of the employee were felt to be
indicative of differences in job investment,
with the older employees having a greater
investment than younger employees and men
having a greater investment than women.
Therefore, the personal data items of age,
sex, length of service, and a job involvement
(JI) index were considered together as indices
of job investment. The prediction of a
positive relationship between these items and
attitudes toward change was based on the
assumption that job changes necessitate
adjustments which would have more negative
valence for the employees with little job
investment than for those with a greater investment
in their jobs.

Secondly, it was hypothesized that attitudes
toward change are positively related
to the capacity for adjustment to change.
Capacity for adjustment was defined in terms



of general ability, amount of formal educa- 
tion, and freedom from job anxiety. 

At the work group level, relationships were 
predicted between three measures of super- 
visory attitudes and group attitudes toward 
change. In view of the supervisor's role as 
administrator of change and in light of the 
evidence for the effectiveness of group par- 
ticipation methods of administering change, 
positive relationships were predicted (a) be- 
tween the change attitudes of the supervisor 
and group attitudes toward change, and (b)
between the human relations attitudes of the 
supervisor and the change attitudes of the 
group. Prior research also provided a rationale 
for predicting a negative relationship between 
the authoritarianism of the supervisor and 
group change scores. 

Finally, reasoning that change may present 
a threat to the satisfaction and security ob- 
tained in cohesive work groups, it was pre- 
dicted that cohesiveness would be negatively 
related to attitudes toward change in the 
work group.

Since there is little in the way of formal 
theory regarding the phenomenology of work- 
related change, these predictions were tenta- 
tive and the study largely an exploratory 
one.

PROCEDURES 

Data were obtained from questionnaires and from
the personnel files on 46 supervisory and 232
nonsupervisory personnel, approximately 85% of
the home office staff of an insurance company.
[...] (range: 23 to 42 years), and a median length
of service of [...] months. Eighty-two percent were
high school graduates, and an additional 10% were
college graduates.

A more detailed description of the sample, the
questionnaires, and data collection procedures is
[...].


The index of employee attitudes toward change
consisted of nine items from the questionnaires:

1. The job that you would consider ideal for you
would be one where the way you do your work
(five choices from “is always the same” to
“changes a great deal”).

Each of the remaining eight items had five
alternatives, from “I strongly agree” to
“I strongly disagree.”



2. If I could do as I pleased, I would change 
the kind of work I do every few months. 

3. One can never feel at ease on a job where 
the ways of doing things are always being 
changed. 

4. The trouble with most jobs is that you just 
get used to doing things in one way and then 
they want you to do them differently. 

5. I would prefer to stay with a job that I 
know I can handle than to change to one where 
most things would be new to me. 

6. The trouble with many people is that when 
they find a job they can do well, they don't 
stick with it. 

7. I like a job where I know that I will be
doing my work about the same way from one 
week to the next. 

8. When I get used to doing things in one 
way it is disturbing to have to change to a 
new method. 

9. It would take a sizeable raise in pay to get 
me to voluntarily transfer to another job. 
Responses to these items were coded serially from
1 to 5 with high item scores indicating favorable
change attitudes. Summated scores ranged from 11
to 45 with a mean of 28.03 (σ = 6.64). Comparisons
of item means for the upper and lower 27% groups
(N_U = N_L = 61) from the composite score
distribution yielded t values from 3.5 to 13.5 with
a median t = 12.1 (p < .01 for each item). All items
were monotonically related to total scores. The
corrected odd-even reliability coefficient was .79.
Hence, the items met the criteria for a Likert scale
and were labeled the Change Scale.
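
The item-analysis steps just described (summated scoring, comparison of upper and lower 27% groups, and a corrected odd-even reliability) can be sketched as follows. This is a minimal illustration and not the author's computing procedure: the function names, the Welch form of the t statistic, and the use of the Spearman-Brown formula as the "correction" are assumptions introduced here.

```python
import math
from statistics import mean, variance


def summated_scores(responses):
    """Sum each respondent's item codes (1-5) into one scale score."""
    return [sum(row) for row in responses]


def extreme_groups(responses, fraction=0.27):
    """Split off the upper and lower `fraction` of respondents by total score."""
    ranked = sorted(responses, key=sum)
    k = max(1, round(len(ranked) * fraction))
    return ranked[-k:], ranked[:k]


def item_t(upper, lower, item):
    """Welch-type t comparing one item's mean in the upper and lower groups.

    Needs at least two respondents per group and some within-group variance.
    """
    hi = [row[item] for row in upper]
    lo = [row[item] for row in lower]
    se = math.sqrt(variance(hi) / len(hi) + variance(lo) / len(lo))
    return (mean(hi) - mean(lo)) / se


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(responses):
    """Correlate odd- and even-numbered item halves, then apply the correction."""
    odd = [sum(row[0::2]) for row in responses]
    even = [sum(row[1::2]) for row in responses]
    return spearman_brown(pearson_r(odd, even))
```

Under these assumptions, 27% of 225 respondents rounds to 61 per extreme group, which agrees with the group size reported above, and a half-test correlation near .65 would step up to roughly .79.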

While the Change Scale had face validity as a
measure of generalized attitudes toward change, 
evidence of logical validity was sought in a com- 
parison of scale scores with responses to questions 
about specific past, current, and anticipated future 
change events. If the scale were a valid measure 
of attitudes toward change, high scores would be 
associated with favorable responses when much 
change was perceived and unfavorable responses 
when little or no change was perceived. Conversely, 
low Change Scale scores would be predictive of 
unfavorable responses when much change was per-
ceived and favorable responses when little or no
change was perceived. 

Four pairs of Likert-type items in the question- 
naires permitted a test of these predictions. The 
first item in each pair indicated the amount of 
change perceived ("Have machines changed the 
nature of your job in the past two years?” “What 
effect did the change-over [to the new computer] 
have on your job?" “In general, how much change 
takes place from time to time in the way you do 
your present job?" and "Do you think the computer 
will affect your job in the next year or two?").
Each of these items was followed by the same five-
choice item: “How do (did) you feel about this?”
Alternatives to this item ranged from “like it very 
much" to "dislike it very much." 

Employees were dichotomized into “high change” 
and “low change” groups on the basis of responses 


Don A. Trumbo


to the first item in each pair. These groups were
further subdivided into “like” (L), “indifferent” (I),
and “dislike” (D) categories by their responses to 
the second item in each pair. Differences among 
the mean Change Scale scores for the L, I, and 
D subgroups of the “high change” employees were 
significant and in the predicted direction for each 
of the four items. For “low change” employees, 
the differences were consistently in the predicted di- 
rection, but significant for one item only. 

These comparisons indicate that Change Scale 
scores were predictive of attitudes toward specific 
change situations, particularly when the employee 
perceived or anticipated relatively extensive changes 
in his own job.

Job investment. Data on length of service, age, 
and sex were obtained from the questionnaires and 
from the personnel files. The JI index was a com-
posite score based on two questionnaire items
dealing with future identification with the work 
force (“What do you hope to be doing five years 
from now?” and “What do you really expect to be 
doing five years from now?”) and two items indi- 
cating the economic significance of the job (“Are 
you the main wage earner in your household?” 
and “Could your household live adequately if you 
were not working?”). Item responses were dichoto-
mized and scored “0” (low involvement) or “1”
(high involvement). Intercorrelations among the 
four items ranged from .38 to .89. 
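
The composite scoring described above can be sketched as a simple sum of four dichotomized items, giving the five possible scores (0 to 4) analyzed later. This is a hedged illustration only: the text does not say which response alternatives counted as "high involvement," so the sketch assumes each item has already been dichotomized, and the function names are hypothetical.

```python
from collections import Counter


def ji_index(item_flags):
    """Sum four dichotomized items (0 = low, 1 = high involvement)."""
    if len(item_flags) != 4 or any(flag not in (0, 1) for flag in item_flags):
        raise ValueError("expected four item scores, each 0 or 1")
    return sum(item_flags)


def score_distribution(all_flags):
    """Count employees at each of the five possible JI scores (0 through 4)."""
    counts = Counter(ji_index(flags) for flags in all_flags)
    return [counts.get(score, 0) for score in range(5)]
```

A tally of this kind is what one would inspect to judge whether scores are roughly evenly spread over the five intervals.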

Capacity for job adjustment. Indices of the em- 
ployee's capacity for adjustment to job changes 
included Personnel Test scores (Wonderlic & Hov- 
land, 1939), years of formal education, and a single 
item job anxiety index. Personnel Test Scores were 
available from the personnel files for 181 of the 
232 employees. Job anxiety was indicated by re-
sponses to the item: “Do things here at work
ever make you feel ‘jumpy’ or nervous?” (six
alternatives from “never” to “very often”).

Supervisory attitudes. The Change Scale was in-
cluded in the supervisors’ form of the questionnaire,
together with the Leadership Inventory (Nelson, 
1949) and an abbreviated (11-item) form of the 
Dogmatism Scale (Rokeach, 1956). Scores on these 
three scales served as measures of supervisory atti- 
tudes toward change, toward human relations prac- 
tices, and toward authority, respectively. 

The Leadership Inventory consists of 50 two- 
choice items with alternatives designed to reflect 
four idealized patterns of leadership behavior 
(Autocratic, Bureaucratic, Idiocratic, and Demo-
cratic). Item stems are presented twice, once with
A and D alternatives, and again with B and I 
alternatives. In the present study, items were com- 
bined into 25 four-choice items. A Human Relations 
score was obtained by summing the number of 
Idiocratic and Democratic alternatives chosen. 

The 11 items for the abbreviated Dogmatism
(D) Scale were selected on the basis of item
analysis data presented elsewhere (Rokeach, 1956).
The D scale was developed by Rokeach (1956) as
a measure of general authoritarianism. Conceptu-
ally, dogmatism is, in part, an attitude toward
authority. High dogmatic individuals are thought
of as being more accepting or rejecting of others
depending upon whether they agree or disagree
with their own beliefs. The dogmatic person
is at the same time aggressive toward peers and
subordinates and submissive toward persons in
positions of authority over him (Trumbo, Rokeach,
& Gladin, 1957). Such attitudes toward authority
and interpersonal relations may be important
conditioners for the behavior and attitudes of the
subordinates.


Group cohesiveness. The five-item index of group
cohesiveness used by Seashore (1954) was included
in the questionnaire. These items were designed to
measure the extent to which an individual perceives
himself as belonging to a group, desires to remain
in the group, and perceives his group to be superior
to others.

The predicted relationships were tested by prod-
uct-moment correlations or by tests of differences
between means, depending on the nature of the
available data.


RESULTS 


Attitudes toward Change and Indices of Job
Investment

Data on age, sex, length of service, and scores
on the JI index were obtained for nearly all of
the 232 nonsupervisory employees. Slight variations
in N resulted when questionnaire items were omitted
or personnel records were incomplete. The results
are summarized in Table 1.

Sex. Change Scale scores were available for 47
male and 178 female employees. The mean score for
men was 31.4 and for women 27.1. The difference of
4.3 points yielded a t value of 3.98 which was
significant (p < .001).

Age. Information on date of birth was recorded
for 217 of the 232 employees. On the basis of the
age distribution, employees were grouped into six
age level categories. A mean Change Scale score was
computed for each age group. An F test of the
differences among these means was not significant
(F = 1.41 with 5/211 df; p > [...]).

Length of service. As indicated in Table 1, data
were available on length of service with the company
for 209 employees. Because of a rather high turnover
rate, particularly among the younger female
employees, the distribution of length of service was
highly skewed. Consequently, the distribution was
dichotomized at the median (15 months) into long- and
short-term groups. Comparison of the mean Change
Scale scores for these two groups failed to indicate
any difference in attitudes toward change
(t = [...]).

TABLE 1
SUMMARY OF CORRELATES OF INDIVIDUAL
CHANGE SCALE SCORES

Variable            Group                    N     Mean    Test statistic
Sex                 M                        47    31.4    t = 3.98**
                    F                       185    27.1
Age (in years)      17-20                    31    [...]   F = 1.41
                    [...]                    58    [...]
                    [...]                    28    [...]
                    [...]                    36    [...]
                    [...]                    34    [...]
                    [...]                    30    [...]
Length of service   +15 months              112    [...]   t = [...]
                    -15 months               97    [...]
Job involvement                             229            r = .13
Personnel Test                              181            r = .28**
Education           Less than high school    17    [...]   F = 6.94**
                    High school             126    [...]
                    Less than college        57    [...]
                    College                  23    33.8
Job anxiety                                 229            r = -.17**

** p < .01.

Job involvement. The JI index yielded five score
intervals from 0 to 4. Scores for all employees were
essentially rectangularly distributed over the five
score intervals. Therefore, to test for relationship,
a mean Change Scale score was computed for employees
at each JI score interval. An F test of the variance
among these means was not significant (F = 1.36 with
4/224 df; p > [...]). The direction of the
relationship between JI and Change Scale scores was
positive, however, as indicated by a product-moment
correlation of .13 [...].


Attitudes toward Change and Indices of the
Capacity for Job Adjustment

Each of the indices of the employee's capacity for
adjustment to changes in the job was found to be
predictive of change attitudes. The results are
presented in the lower portion of Table 1.

General ability. A product-moment corre- 
lation of .28 between scores on the Wonderlic 
Personnel Test and the Change Scale was 
significantly different from zero (with 179
df, p < .001).

Amount of education. Data on years of
formal education were obtained for 223 employees,
who were subsequently grouped at four levels of
education: less than high school (N = 17), high
school (N = 126), some college (N = 57), and
college and graduate work (N = 23). An F test of
the differences among the mean Change Scale scores
of these four groups yielded a significant F of 6.94
(with 3/219 df, p < .01). Differences were
consistent in direction, indicating a positive,
monotonic relation between amount of education and
attitudes toward change. The mean for the college
and graduate group (33.8) was significantly
different from each of the other means, but none of
the remaining differences was significant.

Job anxiety. Responses to the single item
index of job anxiety (“Do things here at
work ever make you feel ‘jumpy’ or nervous?”)
were distributed as follows: never
(N = 18), very seldom (N = 49), seldom
(N = 54), sometimes (N = 91), quite often
(N = 14), very often (N = 3). Scores were
assigned serially from 1 to 6 to these response
categories. A product-moment correlation
between these scores and Change Scale
scores yielded a coefficient of -.17 which
was significantly different from zero (with
227 df, p < .01).
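
The test of whether a product-moment correlation differs from zero can be sketched with the usual t transformation, t = r * sqrt(df / (1 - r^2)). The article does not state which computing formula was used, so this is an assumed, standard approach, and the function name is hypothetical.

```python
import math


def t_from_r(r, df):
    """t statistic for testing a correlation against zero, given its df."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(df / (1.0 - r * r))
```

Applied to the coefficients reported above, t_from_r(0.28, 179) is about 3.90 and t_from_r(-0.17, 227) is about -2.60; the resulting t would then be referred to a t table at the stated degrees of freedom.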


Except for differences between men and women, 
the four variables (sex, age, length of service, and 
job involvement) assumed to be indicative of dif-
ferences in the job investment of the employee
failed to be related significantly to attitudes toward
change. Consequently, little support is provided
for the prediction that the employee with greater job 
investment would have more favorable attitudes to- 
ward change. On the other hand, less favorable 
attitudes among the female employees may indicate 
that change is perceived as threatening to social 
aspects of the job, which women rate as relatively 
more important than do men (Herzberg, Mausner, 
Peterson, & Capwell, 1957). Support for this in-
terpretation was found in the analysis of situa-
tional factors presented below.

Indices of general ability, amount of education,
and freedom from job worries were each found to
be predictive of more favorable attitudes toward
change. These variables appear to reflect three as- 
pects of the employee's capacity to adjust to 
changes in his work. The results consistently sup- 
port the prediction that attitudes toward change 
are related to the capacity to adjust to change 
situations. 


Attitudes toward Change in the Work Group 


Of the 232 nonsupervisory employees, 167 
were identified with 21 work groups from 
each of which three or more employees had 
completed the questionnaires. The work 
groups, defined in terms of common first 
level supervision, ranged in size from 3 to 27 
with a median of four members represented 
in the questionnaire data. 

Basic to the decision to study change 
attitudes at the group level was the assump- 
tion that such attitudes are associated with 
group membership. To test this assumption, 
a simple analysis of variance was performed 
on the Change Scale scores of the 21 work 
groups. The resulting F value of 2.64 was 
significant (with 20/146 df, p < .01), indi-
cating that employees within work groups
are relatively homogeneous with respect to 
attitudes toward change. A Bartlett's test of 
homogeneity of variance was not significant 
(χ² = 3.77), supporting the conclusion that
the significant between-group variance was 
indicative of differences among the group 
means.

These results demonstrate both that change 
attitudes are associated with group member- 
ship and that groups can be meaningfully 
described in terms of attitudes toward 
change. Consequently, Group Change scores 
were computed for the 21 groups as the mean 
of the Change Scale scores of group members. 
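
The group-level analysis above (a simple one-way analysis of variance across work groups, followed by group mean scores) can be sketched as below. This is an illustrative reconstruction with hypothetical function names, not the author's computation, and it omits the Bartlett test of homogeneity of variance.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of per-group score lists."""
    scores = [s for g in groups for s in g]
    grand = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


def group_change_scores(groups):
    """Group Change score = mean Change Scale score of the group's members."""
    return [mean(g) for g in groups]
```

With 21 groups and 167 employees the degrees of freedom work out to 20 and 146, the values reported with the F ratios in this section.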

Group cohesiveness and group change 
scores. A Cohesiveness score was obtained 
for each of the 21 groups by averaging the 
scores of group members on the five-item
index. An F test of the between-group vari-
ance yielded an F of 3.91 (with 20/146 df,
p < .01), confirming Seashore’s (1954) find-
ing that cohesiveness is a meaningful dimen-
sion when applied to industrial work groups,
and supporting the decision to use the mean
scores as measures of group cohesiveness.

The predicted negative relationship between
Cohesiveness and Group Change scores was tested
by computing a product-moment correlation between
the two sets of scores. The results, in Table 2,
were consistent with the prediction. The
coefficient of -.50 was significantly different
from zero (with 20 df, p < .01).

Supervisors’ attitudes and group attitudes 
toward change. Results of the correlations 
between three indices of supervisors’ atti- 
tudes and Group Change scores are presented 
in Table 2. 

The Human Relations (HR) scores from
the Leadership Inventory provided a logically
valid index of employee-oriented supervisory
attitudes. Since, as indicated earlier, a social
climate conducive to group participation and
decision making appears to reduce overt resistance
to change, it was predicted that HR scores would
be positively related to Group Change scores. The
results failed to support this prediction, however.
The coefficient of .04 obtained between these two
sets of scores was obviously not significant.

The supervisors’ Change Scale scores served
as the second measure of supervisory attitudes.
Assuming the supervisor’s role as an administrator
of change to be an important conditioner of group
change attitudes, a positive relationship was
expected between Change Scale scores of the
supervisor and those of the group. The coefficient
of .41 between these two sets of scores supports
this prediction (with 20 df, p < .05).


TABLE 2
SUMMARY OF CORRELATES OF GROUP CHANGE
SCORES AND THEIR INTERCORRELATIONS
(N = 21 groups)

                                      Change Scale    Cohesive-   Human relations   Dogmatism
                                      (supervisors)   ness        score             (supervisors)
                                                                  (supervisors)
Change Scale (group)                  .41*            -.50**      .04               -.38*
Change Scale (supervisors)                            -.65**      [...]             [...]
Cohesiveness                                                      [...]             [...]
Human relations score (supervisors)                                                 [...]

* p < .05 (two-tailed test).
** p < .01 (two-tailed test).




The final measure of supervisors' attitudes
was the abbreviated Dogmatism (D) Scale.
As indicated earlier, a negative relationship
was predicted between supervisors’ Dogmatism
scores and Group Change scores. The results
are consistent with this prediction. The
correlation between these two sets of scores
was -.38, which is significantly different from
zero (p < .05 with 20 df).


DISCUSSION


Failure to find a significant relationship be-
tween the human relations attitudes of the
supervisors and group attitudes toward change
was rather unexpected in view of the popular
belief about the effectiveness of "employee
oriented" supervision. One explanation of this
lack of relationship may be found in the restricted
range of human relations attitudes among the
supervisors. In this connection, it was found that
the standard deviation of HR scores of the 21
supervisors in the group analysis was considerably,
though not significantly, less than that of the
population of 46 supervisors from which they were
drawn (σ(21) = [...]79, whereas σ(46) = 11.62).

Secondly, it is possible that, while human
relations attitudes of the supervisor condition
certain attitudes of their subordinates, it
should not be assumed that they condition all
work-related attitudes. However, no support
for this explanation was found. HR scores
were not significantly related to either group
cohesiveness scores or summated scores on a
15-item job satisfaction check list.
The negative relationship between Cohesiveness
and Group Change scores may indicate that
change is perceived as a source of threat to
informal group structure and to the satisfaction
of social needs provided for in the work group.
This interpretation, which was used earlier to
account for the less favorable attitudes of female
workers, appears to be supported by both findings.

A more adequate understanding of the phenomenology
of industrial change requires an analysis of
employee needs in relationship to change attitudes
and of the relationship of this attitude dimension
to other [...]. The employee's response to change
is probably conditioned by his perception of the
way in which the effects of change relate to
his needs. If change as a general phenomenon 
is to be accepted, its effects must be perceived 
as generally more rewarding than unreward- 
ing, that is, they must provide need satisfac- 
tion. 

What are the effects of change which may 
condition change attitudes? The most obvious 
effect is that change provides variety in the 
job routine. Thus, favorable attitudes toward 
change may reflect a need for variety or 
avoidance of repetition. Change may also en- 
hance the opportunities for promotion thereby 
providing satisfaction for status needs. Fi- 
nally, skill requirements and responsibility de- 
mands may be increased as a result of change, 
thereby providing greater challenge and per- 
haps greater self-expression in one’s work. 

These speculations about the need structure 
underlying attitudes toward change gain some 
support from the following evidence: Positive 
relationships were found between Change 
Scale scores and the extent to which employ- 
ees liked reported increases in variety, skill 
requirements, responsibility demands, and 
chances for promotion in their work. These 
results suggest that employees may favor 
change because they perceive it as a means 
to greater variety, status, and self-expression 
at work. 

An alternative interpretation would be that 
readiness for change is simply a manifesta- 
tion of job dissatisfaction. That is, an em- 
ployee may welcome change because he is 
dissatisfied with his job in general or with
specific aspects of the job, such as fellow em-
ployees or supervision. Evidence bearing on
this interpretation was negative, however. The
correlation between Change Scale scores and
summated scores on the 15-item Job Satisfac-
tion (JS) check list was .07. Furthermore,
upper and lower 27% groups from the Change
Scale distribution did not differ significantly 
on any of the 15 items of the JS check list. 

In review, the following tentative conclu- 
sions may be presented as hypotheses for 
further research. Employees may perceive 
change as a source of threat to social aspects 
of their job, as indicated both by the less 
favorable attitudes of females and by the 
negative correlation between group cohesive-
ness and Group Change scores. Situational
factors, manifested in supervisory attitudes and
group cohesiveness, appear to be important 
conditioners of change attitudes in the work 
group. Individual differences in attitudes to- 
ward change may reflect differences in the 
capacity to adjust to change situations. Fi- 
nally, readiness for change may be related to 
a complex of needs including variety, status, 
and self-expression at work.


SUMMARY 


This report presented some of the findings 
of a study of the correlates of employee atti- 
tudes toward change as a general job-related 
phenomenon. Questionnaire and personnel file 
data were obtained for 232 nonsupervisory 
and 46 supervisory personnel of a medium 
sized midwestern insurance company involved 
in office automation. Attitudes toward change 
were measured with a nine-item Change Scale 
included in the questionnaires. Evidence for
the reliability and logical validity of the
scale was presented.


The results may be summarized as follows: 


1. Female employees scored significantly 
lower on the Change Scale than males. How- 
ever, since none of the other indices of job 
investment (age, length of service, and Job 
Involvement index) was predictive of change 
attitudes, the sex difference was interpreted in 
terms of the differential importance assigned 
to various aspects of the job by men and 
women. Since women consistently rate social 
aspects of the job as relatively more important 
than do men, it was hypothesized that un- 
favorable attitudes toward change may reflect 
a perceived threat to informal social struc- 
ture. 

2. Change Scale scores were positively re- 
lated to Wonderlic Personnel Test Scores, 
amount of education, and freedom from job 
anxiety, supporting the view that change at- 
titudes are related to the capacity to adjust 
to changes. 

3. Attitudes toward change were found to
be associated with work group membership,
Change Scale scores within groups being rela-
tively more homogeneous than among groups.

4. Group cohesiveness was negatively re- 
lated to Group Change scores. This finding 
gave support to the view that less favorable 
attitudes toward change may indicate that
change poses a threat to the satisfaction of
social needs through informal social structure. 
5. Supervisors' attitudes toward change
were positively related to Group Change
scores, while supervisors' scores on a measure
of authoritarianism were negatively related to
Group Change scores. However, supervisors'
scores on an index of human relations atti-
tudes were unrelated to attitudes of the group
toward change.
6. Among employees who perceived increases
during the preceding year in variety,
skill and responsibility demands, and chances
for promotion, approval of these increases was
associated with higher Change Scale scores
than indifference or disapproval. This evidence
provided tentative support for the view that
readiness for change is related to employee
needs for variety, status, and self-expression
at work.


REFERENCES 


Bavelas, A. Cited by K. Lewin. Group decision and
social change. In T. M. Newcomb & E. L. Hartley
(Eds.), Readings in social psychology. New York:
Holt, 1947.

Coch, L., & French, J. R. P., Jr. Overcoming re-
sistance to change. Hum. Relat., 1948, 1, 512-532.

Herzberg, F., Mausner, B., Peterson, R. O., & Cap-
well, D. F. Job attitudes: Review of research and
opinion. Pittsburgh, Pa.: Psychological Service of
Pittsburgh, 1957.

Jacobson, E., Trumbo, D., Cheek, G., & Nangle,
J. Employee attitudes toward technological change
in a medium sized insurance company. J. appl.
Psychol., 1959, 43, 349-354.

Nelson, C. W. The development and evaluation of
a leadership attitude scale for foremen. Unpub-
lished doctoral dissertation, University of Chicago,
1949.

Rokeach, M. Political and religious dogmatism: An
alternative to the authoritarian personality. Psy-
chol. Monogr., 1956, 70(18, Whole No. 425).

Seashore, S. E. Group cohesiveness in the industrial
work group. Ann Arbor: University of Michigan,
Institute for Social Research, 1954.

Trumbo, D. A. An analysis of attitudes toward
change among the employees of an insurance com-
pany. Unpublished doctoral dissertation, Michigan
State University, 1958.

Trumbo, D., Rokeach, M., & Gladin, L. A valida-
tion study with high and low dogmatic groups.
Paper read at Midwestern Psychological Associa-
tion, Chicago, May 1957.

Wonderlic, E. F., & Hovland, C. I. The Personnel
Test: A restandardized abridgment of the Otis
S-A test for business and industrial use. J. appl.
Psychol., 1939, 23, 685-702.

(Received November 25, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 5, 345-350 


RELIABILITY, SEX DIFFERENCES, AND VALIDITY IN THE LEADERLESS
GROUP DISCUSSION TECHNIQUE

WALTER A. KAESS, SAM L. WITRYOL, AND RICHARD E. NOLAN

University of Connecticut

Various technical, predictive, and theoreti-
cal characteristics of the Leaderless Group
Discussion (LGD) Technique in military, in-
dustrial, and college settings have been re-
ported recently (Bass, 1954; Borg & Tupes,
1958; Gleason, 1957). The LGD is a minia-
ture situation in which 6 to 10 subjects
discuss a general problem for approximately
an hour, at the end of which they present a
group solution. The subjects are usually
judged by trained observers instructed to rate
specified characteristics of leadership emerg-
ing in the situation, but Gleason (1957) has
suggested an economical procedure in which
the subjects rate each other. Our study pro-
vides a comparison of the reliability of three
LGD judging techniques with analyses of the
effects of sex composition of the judging, as
well as of the participant, groups. Personality
correlates of leadership were examined along
with the judging behavior of the leaders
themselves.

It would appear almost impossible to meet
standard conventions of objective testing in
the LGD where the problem presented to the
group is intentionally ambiguous, where the
personal-social composition differs from group
to group, and where observer [...] recording
sheer talk to rating [...] for specified
characteristics of leadership ([...], 1954). Yet,
rater reliability [...] approached the limits for
[...] paper-pencil test, and stability [...],
though more variable, have been [...] ([...],
1954). As research has accumulated to yield more
specifications [...] for objectivity, [...], and
economy, the technique has become increasingly
popular among investigators. Bass concluded that
groups of six to eight [...] whose members are
strangers to each other and who discuss case
history materials yield promising results;
Gleason’s (1957)
intersubject rating system dispenses with the
necessity of trained observers. Consequences 
of these findings lead to logical, provocative 
conceptualizations of the LGD. Demonstrated 
validity and reliability estimates of the LGD 
combined with economical rating techniques 
constitute a good predictor instrument for 
leadership. The LGD remains, however, more 
cumbersome than simple paper-pencil tests 
which can be administered to large groups, but 
it can also be viewed as an intermediate cri- 
terion of leadership with better defined char- 
acteristics than widely used applied criteria. 
More economical scales could be validated 
against such a criterion before application to 
more expensive and complex criteria. 

Specifically the purposes of the present in- 
vestigation were to: (a) compare the degree 
of LGD judging agreement within and among 
three conditions of ranking defined as ob-
servers, recorders, and participants; (b)
analyze the influence of sex membership of 
the judges and of the participants; (c) evalu- 
ate the accuracy of leader and nonleader rank- 
ings; and (d) examine the relationships be- 
tween LGD leadership and selected factors on 
the Guilford-Zimmerman Temperament Sur- 
vey. 

PROCEDURE 

Subjects. Subjects were students from 21 laboratory 
sections in introductory psychology. The number of 
students in most lab sections was about 30; subjects 
were assigned to participant and observer or recorder 
roles by a random method. 

LGD Participants. Typically two groups of six
subjects each were seated in separate circular ar- 
rangements at opposite ends of the room. Group A 
was all male, and Group B contained mixed sexes, 
three males and three females. In a few of the 
smaller lab sections there was only one LGD group. 
Participants wore identifying tags, numbered from 
one to six. There were 19 Group A arrangements 
and 19 Group B arrangements, totaling 228 LGD
participants.

Observers and Recorders. Each group was observed
by two pairs of nonparticipating students seated on
the periphery of the circle. Using bridge terminology,
North and South were designated “observers”; 
East and West, “recorders”; each of the two pairs 
consisted of a male and a female. Recorders were 
provided with appropriate tabulation sheets to check 
every incidence of talking by each member of the 
LGD group. Observers were instructed to fill out a 
five-point rating form employed in the Bass studies 
(Bass & Norton, 1951) at the end of the observa- 
tion period. This rating scale contains 18 items in- 
volving initiation of behavior, influencing others, 
consideration for others, and authoritativeness. Ob- 
servers and recorders, 76 males and 76 females, were 
selected from their lab sections. 

Task Presentation. The LGD groups listened to a 
short, 2-minute tape recording of a playlet concern- 
ing a coed's indecision about remaining in college. 
There was no obvious, pat solution. The participants 
in each group discussed the coed's problem 20 
minutes, and then each participant wrote a solution. 
They next ranked each member on leadership, as 
did the observers and recorders, according to the 
following instructions: 


Place the number of the discussant who in your 
opinion was the leader of the group opposite 
Number 1; then list opposite Number 6 the num- 
ber of the person who was lowest in leadership. 
Then opposite Number 2 list the number of the 
discussant next highest in leadership; opposite 5, 
the number of the person second lowest in leadership.
Continue until all people have been ranked.


It should be noted that three independent leader- 
ship rankings were available from observers, re- 
corders, and participants. The experimental design 
provided data relevant to sex differences in judging 
LGD behavior, as well as in the emergence of leader- 
ship. Scores on the first five scales of the Guilford- 
Zimmerman Temperament Survey, administered 10 
weeks prior to the LGD experiment, were available 
on most subjects for analysis of personality correlates.


RESULTS

Reliability


The measures of reliability employed in this
study might be classified as the “degree of agreement”
among judges, with analyses including
the effects of the sex of the judges, the sex
of the LGD participants, and the ranking condition.
Two of the ranking conditions, observing
and recording, were relatively passive in
contrast to active LGD participation, which
led to a type of sociometric evaluation.

Within Conditions of Ranking. The degree 
of agreement was computed between observer 
pairs, between recorder pairs, and among par- 
ticipants. Spearman’s rho coefficient was used 
to express agreement between each of the 38
observer, and between each of the 38 recorder,
pairs. Kendall’s coefficient of concordance, W, 
was particularly appropriate to express the 
degree of agreement among the six partici- 
pants in each LGD group since it constitutes 
the mean of all possible rho correlations; thus, 
the unit of agreement was essentially the same 
for all types of observation. Table 1 presents 
rho and W coefficients in quartile values for 
the observer, recorder, and participant groups: 
the range of the median coefficients is .61-.92. 
If, as is usually the case, the ratings were
pooled over a condition of ranking or evaluation,
the reliabilities of the sums would be
considerably higher; this would be particularly
apparent for the participant sums, which
would be based on six rankings. These conservative
reliability estimates, then, appear
substantial, and they are comparable over
conditions of ranking as demonstrated in
Table 1, and as analyzed below.
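
The agreement measures described above can be illustrated with a minimal sketch. The rankings below are hypothetical, not data from the study. For untied rankings by m judges, W is tied to the average pairwise rho by the identity mean rho = (mW - 1)/(m - 1), so the two statistics carry the same information even though they are not numerically equal. The last step applies the Spearman-Brown formula to show, under that formula's usual assumptions, why a pooled sum of six rankings would be more reliable than a single ranking.

```python
from itertools import combinations


def spearman_rho(a, b):
    """Spearman's rho for two untied rankings of the same n objects."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall_w(rankings):
    """Kendall's coefficient of concordance for m untied rankings of n objects."""
    m, n = len(rankings), len(rankings[0])
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m * m * (n ** 3 - n))


def mean_pairwise_rho(rankings):
    """Average of rho over all pairs of judges."""
    pairs = list(combinations(rankings, 2))
    return sum(spearman_rho(a, b) for a, b in pairs) / len(pairs)


def spearman_brown(r, k):
    """Projected reliability of a composite of k judges with average r."""
    return k * r / (1 + (k - 1) * r)


# Hypothetical example: six participants each rank the six group members.
rankings = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 3, 4, 6, 5],
    [1, 3, 2, 5, 4, 6],
    [1, 2, 4, 3, 5, 6],
    [2, 1, 3, 5, 4, 6],
    [1, 2, 3, 4, 6, 5],
]

m = len(rankings)
w = kendall_w(rankings)
avg_rho = mean_pairwise_rho(rankings)
pooled = spearman_brown(avg_rho, m)

print(f"W = {w:.3f}, mean rho = {avg_rho:.3f}, pooled reliability = {pooled:.3f}")
```

The same functions apply to the observer and recorder pairs with m = 2, in which case W and rho are related by rho = 2W - 1.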

Between Conditions of Ranking. After finding
the high agreement within ranking conditions,
it was important to determine the extent
of agreement among the ranking conditions
in order to obtain further clarification of
which, if any, might be the most adequate
measure of LGD performance. The three conditions
of assessment represented by observer,
recorder, and participant rankings, respectively,
were compared. For each LGD group
of six, the two observer rankings were pooled,
the two recorder rankings were pooled, and
the six participant rankings were pooled. Rho

TABLE 1

QUARTILE VALUES FOR RHO AND W RELIABILITIES

                              Raters
Group              Observers    Recorders    Participants
Rated       Q        (rho)        (rho)          (W)

A (a)       75        .93          .97        [illegible]
            50        .84          .89        [illegible]
            25        .59          .65        [illegible]
B (b)       75    [illegible]      .98        [illegible]
            50    [illegible]      .92        [illegible]
            25        .49          .79        [illegible]

(a) N = 6 in each of 19 same sex Group A combinations.
(b) N = 6 in each of 19 mixed sex Group B combinations.




TABLE 2

MEDIAN RHO VALUES BETWEEN CONDITIONS OF RANKING

                            Group A                  Group B
Condition Combinations      (19 same sex sextets)    (19 mixed sex sextets)

Observer-Recorder                 .93                      .86
Observer-Participant              .93                      .84
Recorder-Participant              .94                      .90


correlations were then computed for the 38 
groups between all combinations of the three 
conditions; these results are summarized in 
the median coefficients presented in Table 2. 
The magnitudes of these median intercorrela- 
tions, ranging from .84 to .94, suggest that 
within a given LGD group the three condi- 
tions of ranking yield comparable results. 
These findings, moreover, verify the substan- 
tial reliability of the LGD technique as 
employed in our study. The base measures
used in our subsequent analyses were the
pooled participant composite rankings because:
(a) they can be easily and economically
obtained in an LGD situation, (b)
they are based upon rankings by the largest
number of judges, and (c) of the convenient
tautology that the leader selected by the
participants is by one definition the leader
of the group.

Interactions between Sex and Conditions of
Ranking. In this analysis of variance, having
selected the pooled participant composite as
our criterion measure, we considered: (a)
two social conditions (19 same sex and 19
mixed sex LGD groups of six each), (b) two
conditions of observation (the remaining two
conditions of ranking by observers and recorders),
and (c) sex within the conditions of
observation (one male and one female observer,
one male and one female recorder in
each of the 38 groups). Using our criterion measure,
we examined accuracy of observation as a
function of ranking condition, sex of the observer,
and interaction with the sex composition
of the LGD group. The sum of the differences
squared between the criterion and the
ranking of an individual observer or recorder
was the unit of measurement for each
condition in a double analysis of variance.
The two main effects, conditions of observation
and sex of the observer, did not

yield significance, nor did the interactions 
with social composition. The findings did not 
support expectations based on our research in 
social intelligence (Witryol & Kaess, 1957) 
that females would be more perceptive of so- 
cial behavior. 


Accuracy of Leader and Nonleader Judgments 


The accuracy of the leader's perception of 
the group as a significant requisite has been 
treated in a contradictory fashion in the re- 
search literature. Those investigators who deal 
with an empathy construct in leadership have 
emphasized social insight and perceptual ac- 
curacy, while other workers have found that 
personal characteristics like dominance and 
egocentricism, which may blunt accurate so- 
cial perception, are more relevant features. 
Our data provided a modest test of these 
incompatible positions. 

On the basis of the participant sum of 
scores received, those ranked highest in leader- 
ship were compared in social accuracy with 
those ranked lowest in each group. The cri- 
terion of group structure was defined by the 
rankings assigned via the pooled participant 
composite, the base measure indicated above. 
Social accuracy was then determined by the 
amount of agreement between an individual's 
judgments and the pooled composite. The 
sum of the differences squared between judg- 
ments by an individual of high or low leader- 
ship and the pooled composite judgments was 
the specific measure for expressing social ac- 
curacy of leaders and nonleaders. 
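
The accuracy measure just described can be sketched as follows. The rankings are hypothetical and only illustrate the arithmetic; the conversion to rho is the standard formula for untied ranks and is added here for orientation, since the text states only that the sum of squared differences was the unit analyzed.

```python
def squared_rank_difference(judge, composite):
    """Sum of squared differences between a judge's ranks and the composite."""
    return sum((j - c) ** 2 for j, c in zip(judge, composite))


def rho_from_d2(d2, n):
    """Spearman's rho implied by a sum of squared rank differences (no ties)."""
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical six-member group; composite ranks are the criterion.
composite = [1, 2, 3, 4, 5, 6]
leader_view = [1, 3, 2, 6, 4, 5]      # ranking submitted by a top-ranked member
nonleader_view = [1, 2, 3, 4, 6, 5]   # ranking submitted by a bottom-ranked member

d2_leader = squared_rank_difference(leader_view, composite)
d2_nonleader = squared_rank_difference(nonleader_view, composite)

print(d2_leader, round(rho_from_d2(d2_leader, 6), 2))
print(d2_nonleader, round(rho_from_d2(d2_nonleader, 6), 2))
```

A smaller sum means closer agreement with the criterion, so in this made-up case the nonleader is the more accurate judge.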

From the analysis of variance in Table 3
it was inferred that nonleader judgments were
more congruent with the criterion than were
leader judgments at a very high statistical
confidence level, beyond .001. In Group A
the median rho for nonleader judgments correlated
with the pooled composite was .89;
for leaders, .79; in Group B the rho’s were
.91 and .60, respectively (N = 19 in both
groups). Perceptual accuracy of leadership
did not yield statistical significance as a
function of group conditions, nor were the
interactions significant. In short, nonleaders
were more accurate in perceiving LGD leadership
status. Further analyses which we shall
not report in detail here support this general
finding. Thus, the leader was as likely to
disagree with the composite in judging nonleaders
as in identifying leaders, reflecting
lack of congruence with the criterion in various
portions of the judging scale. Finally,
leader judgments were in less agreement than
nonleader ratings when both groups were compared
with the observer and recorder composites.


TABLE 3

ANALYSIS OF VARIANCE BY LEADERSHIP AND
LGD GROUP CONDITIONS

Conditions              ms      df      F

Leaders-Nonleaders     882.6     1    13.11*
LGD Groups              99.6     1     1.48
Interaction            165.6     1     2.46
Within Groups         4848.0    72

* Significant at .001 level.
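
The F ratios in Table 3 can be checked with a few lines of arithmetic. The sketch below assumes that the Within Groups entry of 4848.0 is a sum of squares rather than a mean square; that reading is an assumption, but it reproduces the three printed F values when the entry is divided by its 72 degrees of freedom.

```python
# Values as printed in Table 3.
effects = {
    "Leaders-Nonleaders": 882.6,
    "LGD Groups": 99.6,
    "Interaction": 165.6,
}
ss_within = 4848.0   # assumed to be a sum of squares
df_within = 72

# Error mean square under that assumption.
ms_within = ss_within / df_within

# Each effect has 1 df, so its mean square equals its sum of squares.
f_ratios = {name: ms / ms_within for name, ms in effects.items()}

for name, f in f_ratios.items():
    print(f"{name}: F = {f:.2f}")
```

The 72 error degrees of freedom are consistent with four cells (leaders and nonleaders in each of two group types) of 19 judges each, less one per cell.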


Personality of LGD Leaders 


The first five scales of the Guilford-Zim- 
merman Temperament Survey, administered 
10 weeks before the present investigation, 
were available on about 70% of the subjects. 
In Group A (male sextets) 26 leaders, defined 
as Ranks 1 and 2 on the pooled composite 
criterion and 27 nonleaders, defined as Ranks 
5 and 6, were compared on the GZ scale. 
Leaders were higher than nonleaders on 
General Activity (G), Ascendance (A), and 
Sociability (S) at .05, .01, and .05 confidence 
levels, respectively, as demonstrated in Table 
4. A discriminant function equation com- 
puted for Group A took the raw score form: 


D = .838 G + 1.858 A + .348 S


The F of this discriminant function is 28.22
for df = 3/46; the .01 confidence level, 6.60.
The mean discriminant scores for leaders and
nonleaders were 53.93 and 41.74, respectively.
Maximum accuracy of leadership classification
would follow, then, from a D score of 47.84
and above.
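
A brief sketch of how the discriminant score and cutoff fit together, assuming the weights read as .838, 1.858, and .348 (decimal points restored from the scan). Applying those weights to the Table 4 group means gives values close to the reported mean D scores, and the cutoff is the midpoint of the two reported means. The function names are illustrative only.

```python
# Weights assumed from the printed equation, with decimal points restored.
WEIGHTS = {"G": 0.838, "A": 1.858, "S": 0.348}


def d_score(g, a, s):
    """Raw-score discriminant value from General Activity, Ascendance, Sociability."""
    return WEIGHTS["G"] * g + WEIGHTS["A"] * a + WEIGHTS["S"] * s


# Group means from Table 4 (G, A, S).
d_leader = d_score(18.00, 17.32, 19.36)
d_nonleader = d_score(14.12, 13.28, 15.20)

# Cutoff taken as the midpoint of the reported mean D scores.
cutoff = (53.93 + 41.74) / 2


def classify(g, a, s):
    """Label a subject 'leader' when D is at or above the cutoff."""
    return "leader" if d_score(g, a, s) >= cutoff else "nonleader"


print(round(d_leader, 2), round(d_nonleader, 2), round(cutoff, 2))
```

The small gaps between the recomputed values and the reported 53.93 and 41.74 are what one would expect from rounding of the published means and weights.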

When the discriminant equation was applied
to males in Group B (mixed sex sextets),
seven of nine leaders and 12 of 17 nonleaders
were correctly classified. The chi square of
5.54, significant at the .02 level, suggests that
leadership personality characteristics for males
in one type of LGD group (same sex) are
similar in the other type (mixed sex). This
is further supported by an r of .45 between
D scores and the pooled composite leadership
scores for 34 males in Group B.
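
The reported chi square can be reproduced from the classification counts. The sketch below assumes the ordinary 2 x 2 chi square without a continuity correction, an assumption that matches the printed value of 5.54.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi square for a 2 x 2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: actual leaders, actual nonleaders.
# Columns: classified as leader, classified as nonleader.
leaders_hit, leaders_miss = 7, 2
nonleaders_miss, nonleaders_hit = 5, 12

chi2 = chi_square_2x2(leaders_hit, leaders_miss, nonleaders_miss, nonleaders_hit)
hit_rate = (leaders_hit + nonleaders_hit) / 26

print(round(chi2, 2), round(hit_rate, 2))
```

With 1 degree of freedom the .02 critical value is about 5.41, so 5.54 just clears it, in line with the significance level stated in the text.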

The discriminant function did not predict
female leadership in the mixed sex LGD
groups, nor were scores on the five GZ scales
suggestive of leadership-nonleadership differences.
Personality characteristics appropriate
to male leadership do not appear to be the
same as those in female leaders in mixed
groups, the only ones available for the female
analysis. The sex distributions over the six
ranks may provide some clues. The polar
Ranks 1, 5, and 6 were dominated by males:
67%, 69%, and 61%, respectively; the intermediate
Ranks 2, 3, and 4 were dominated by
females: 70%, 58%, and 70%, respectively.
Females appear to exercise more subtle, or
secondary, leadership when the sexes are
mixed. Those who actively competed for the
first position with males fall outside the conventional
pattern. Obviously personality characteristics
of females in a mixed sex situation
are more complex in association with leadership
under these circumstances.


TABLE 4

LEADER AND NONLEADER GUILFORD-ZIMMERMAN
SCORES FOR MALE GROUPS

                         Leaders          Nonleaders
                         (N = 26)          (N = 27)
Scale                    M       SD       M       SD       t

General Activity       18.00    6.32    14.12    5.81    2.28*
Restraint              14.02    4.71    15.07    5.40    NS
Ascendance             17.32    4.82    13.28    5.53    [illegible]
Sociability            19.36    6.25    15.20    6.95    [illegible]
Emotional Stability    16.35    5.40    16.63    5.20    [illegible]

* Significant at .05 level.
** Significant at .01 level.


Discussion 


Our reliability findings are high and con- 
gruent with those reported in the definitive 
LGD review (Bass, 1954). We have also 
demonstrated substantial agreement among 
participants when they rank their own peers 
within a group, an analysis not previously 
found in the literature, and we have compared 
these rankings with two different external 
rating observations of LGD leadership. The
strong agreement within and among these
various conditions of observation is impressive
when one considers the heterogeneous
social possibilities inherent in 38 six-member
groups where the members were more or less
strangers to each other and where 19 of the
groups contained male and female subjects.
In one group a leader might take command
and be subsequently dethroned, in another
no one had much to say, and in a third everyone
talked at the same time. These casual
observations have been confirmed and systematized
in a small group study employing
Bales’ categories (Kirscht, Lodahl, & Haire,
1959) from which the investigators concluded
that the leader is the one who has ideas where
they are lacking, or that the leader has ideas
that are integrative and task relevant when
talk abounds. Our findings strongly reflect the
accuracy of LGD leadership determination
despite the variability of group composition,
situational demands, and the special requirements
of leadership.

We are tempted to speculate about an
hypothetical predisposition toward leadership
which takes many forms as a function of social
situation. The LGD technique yields a global
approach to leadership comparable to Binet’s
success in the measurement of intelligence
where Cattell’s atomism failed. Carrying this
[illegible] one step further, the personality
[illegible] predict small group leadership [illegible]
may be similar to the [illegible],
with varying degrees of success, of [illegible]
for intellectual measurement [illegible],
and in both areas the situation [illegible]
too often leads to a frustrating [illegible].
None of these approaches need
be abandoned; reviews of the literature on 
small groups have pointed up the multiple 
possibilities (Bass, 1954; Mann, 1959). 
Global judgments of observers, however, and 
particularly of participants may uncover per- 
sonality characteristics underlying leadership 
in a wider variety of social situations; flexi- 
bility of behavior required for leadership un- 
der diverse circumstances may not be discov- 
ered if emphasis is placed upon the minutiae 
of action. 

The personality correlates of male LGD 
leadership identified in our investigation (gen- 
eral activity, ascendance, and sociability or 
“GAS”) suggest an intriguing research defi- 
nition of the issues raised. The GZ factors of 
ascendance and sociability for small group 
leadership have been cited in the two pertinent
reviews by Bass and by Mann; these
factors have also been confirmed with projec- 
tive material (Mussen & Porter, 1959). Bass 
cited these correlates in female groups; we 
were unable to replicate for our females in 
mixed sex groups, but we did get replication 
for males in both same sex and mixed sex 
groupings. The evidence reflects the signifi- 
cance of these personality factors for males 
in a wide variety of LGD situations. The en- 
ergetic, ascendant, sociable male is likely to be 
a leader among both men and women in the 
most global sense. The failure to replicate 
for females in mixed sex groupings is ascribed 
to cultural role expectations, but the same 
factors “work,” according to the literature, 
for female leaders in female groups. 

If we can assume that the elusive concept 
of interpersonal sensitivity was to some ex- 
tent reflected in our analysis of leader and 
nonleader judging accuracy, and if LGD lead- 
ership represents important dimensions, the 
implications for research direction are fairly 
clear. Our leaders were inferior to the non- 
leaders in their accuracy of ranking both lead- 
ers and nonleaders in LGD groups, based on 
our sociometric and external ranking criteria. 
Our results support a qualified endorsement of 
the relevance of certain strong personality 
traits demonstrated in another LGD study 
(Borg & Tupes, 1958) where rated character- 
istics such as extroversion, assertiveness, and 
energy were related to leadership. Mann’s
review (1959) of 500 personality studies involving
small group behavior tends to support
this position, and he pointed out difficulties 
in dealing with such constructs as empathy 
and interpersonal sensitivity. It may well be, 
on the other hand, that personality trait cor- 
relates of leadership are a function of the 
greater popularity of this research approach 
among psychologists, leading to an exagger- 
ated emphasis as a consequence of the sheer 
number of investigations. 

To conclude, certain requirements of 
economy and standardization of LGD pro- 
cedures were urged by Bass (1954). Gleason 
(1957) reduced the need for trained observers 
by "buddy" ratings, and we have reduced 
discussion time to 20 minutes and demon- 
strated substantial reliability for male and 
mixed sex groups. Other investigators have 
employed brief discussion periods without re- 
porting reliability (Borg & Tupes, 1958; 
Kirscht et al., 1959; Mussen & Porter, 1959). 
We recommend the LGD as a relatively well 
standardized prototype of group behavior with 
strong possibilities of general predictive effi- 
ciency. Assuming the latter we are impressed 
with the potential utility of the LGD as a 
stable intermediate criterion against which to 
assess more economical pencil-paper measures 
of leadership. The LGD may well represent 
a model paradigm for getting a criterion to 
stay put in a complex area of measurement. 


SUMMARY 


Three Leaderless Group Discussion judging 
techniques involving leadership rankings by 
observers, recorders, and participants were 
applied to 228 college subjects arranged in 
19 all male and 19 mixed sex groups of six 
each. Judging agreement within and among
ranking conditions was high, despite the fact
that judges in all conditions were 380 untrained
college observers; the sex of those
rating and those being rated did not appear
to affect ranking accuracy.

When evaluated against a pooled com- 
posite ranking criterion of participant judg- 
ments, a form of sociometric assessment, LGD 
leaders were inferior to nonleaders in their 
social accuracy of judging LGD leadership. 
Although leaders failed in interpersonal sensitivity
as measured by the criterion above, the
males were characterized by “GAS”: General
Activity, Ascendance, and Sociability factor
correlates from the Guilford-Zimmerman Temperament
Survey; these relationships are consistent
with published evidence reviewed by
Bass (1954) and Mann (1959). A GAS
discriminant function failed to replicate for
female leaders whose LGD positions were
available for analysis in mixed sex groups
only, but the function was applicable to male
leaders in same sex and mixed sex groups.
The Bass review has indicated similar personality
traits for female leaders in same
sex LGD groups.

Our findings taken together with published 
evidence suggest a reliable and economical 
approach to LGD leadership assessment under 
complex social circumstances. The definition
by personality correlates of the characteristics
of LGD leadership, demonstrated in research,
leads us to recommend this small
group approach not only as a useful predictor,
but also as a fruitful intermediate
criterion for evaluating other research approaches
to leadership.


REFERENCES 


Bass, B. M. The leaderless group discussion. Psychol.
Bull., 1954, 51, 465-492.

Bass, B. M., & Norton, F. T. M. Group size and
leaderless discussions. J. appl. Psychol., 1951, 35,
397-400.

Borg, W. R., & Tupes, E. C. Personality characteristics
related to leadership behavior in two types
of small group situational problems. J. appl. Psychol.,
1958, 42, 252-256.

Gleason, W. J. Predicting army leadership ability
by modified leaderless group discussion. J. appl.
Psychol., 1957, 41, 231-235.

Kirscht, J. P., Lodahl, T. M., & Haire, M. Some
factors in the selection of leaders by members of
small groups. J. abnorm. soc. Psychol., 1959, 58,
406-408.

Mann, R. D. A review of the relationships between
personality and performance in small groups.
Psychol. Bull., 1959, 56, 241-270.

Mussen, P. H., & Porter, L. W. Personal motivations
and self-conceptions associated with effectiveness
and ineffectiveness in emergent groups.
J. abnorm. soc. Psychol., 1959, 59, 23-27.

Witryol, S. L., & Kaess, W. A. Sex differences in
social memory tasks. J. abnorm. soc. Psychol.,
1957, 54, 343-346.


(Received December 12, 1960) 




Journal of Applied Psychology 


Vol. 45, No. 6 DECEMBER 1961


DONALD G. PATERSON 




1892-1961 


Applied psychology lost one of its pioneers and 
one of the major contributors to its development 
with the death on October 4, 1961 of Donald G. 
Paterson. Returning to this country after partici- 
pating in the fourteenth International Congress 
of Applied Psychology in Copenhagen, he was 
hospitalized in Minneapolis and died of cancer 
after a brief illness. 

Paterson was Editor of this journal for 12 years, 
from 1943 through 1954, being the first person 
to perform this service under the APA's owner- 
ship and management of the journal. His list of 
publications exceeds 300 titles, including the au- 
thorship or co-authorship of Minnesota Mechani- 
cal Ability Tests (1930), Physique and Intellect 
(1930), Men, Women, and Jobs (1936), Student 
Guidance Techniques (1937), How to Make Type 
Readable (1940), The Minnesota Occupational 
Rating Scales and Counseling Profile (1941), Lo- 
cal Labor Market Research (1948), Revised Min- 
nesota Occupational Rating Scales (1953), and 
Studies in Individual Differences (1961). It is 
characteristic of Paterson’s mode of working and 
of his constant involvement with his many stu- 
dents that many of his books and articles were 
published jointly with students and colleagues. 
That his methods resulted in research produc- 
tivity is demonstrated by the list of about 1,400 
titles of articles published by his students which 
was prepared and published at the University of 
Minnesota at the time of Paterson’s retirement 
in 1960. 

Donald G. Paterson was born on January 18, 
1892. He obtained his undergraduate and gradu- 
ate training at the Ohio State University. Prior 
to World War I, he was an Instructor in psy- 
chology at the University of Kansas, where he 
met and married Margaret Young. His wife and 
their two children, Philip Paterson and Mrs. Robert
C. Becker, survive him. From 1917 through
1919 he was Chief Psychological Examiner and
a Captain in the Sanitary Corps of the United
States Army. For the next 2 years he was a member
of the Scott Company, the first of psychological
consulting organizations.

Paterson moved to the University of Minnesota
in 1921, and in 1923 was promoted to Professor.
He retired in 1960 after 39 years of service.
In this interval he became a leading figure
in psychology nationally, in his own community,
and in the University of Minnesota. He pioneered
in the whole advance of student personnel work,
vocational counseling, industrial and personnel
psychology, and differential psychology. He played
a major role in the 1930s and 1940s in the establishment
of the University of Minnesota’s Student
Counseling Bureau, the Minnesota Employment
Stabilization Research Institute, and the
university's Industrial Relations Center. He was
a founder and President of the American Association
of Applied Psychology, and served as
Secretary of the American Psychological Association
for 6 years. He was a Diplomate in industrial
psychology of the American Board of Examiners
in Professional Psychology. He received
an honorary LLD degree from Ohio State University
in 1952, and in 1956 was selected to deliver
the Walter V. Bingham lecture at his alma
mater.

Since 1921, approximately 300 students have
earned their MA degrees, and 88 students their
PhD degrees with him as their major adviser.
Additional thousands of undergraduate and graduate
students have taken his courses in the psychology
of individual differences, vocational and
occupational psychology, and occupational counseling.
Hundreds of his students, currently working
in education, government, and industry, are
applying their knowledge and engaging in research
in areas of applied psychology. His imprint
on applied psychology will continue to be
seen in the work of these students, and in the
contributions he has made to the knowledge of
psychological principles and their application in
all areas of life.




Journal of Applied Psychology 
1961, Vol. 45, No. 6, 353-358 


THE ASSESSMENT OF CREATIVITY IN A RESEARCH SETTING 


WILLIAM D. BUEL

Science Research Associates, Chicago, Illinois

AND VIRGINIA M. BACHNER

Pure Oil Company, Palatine, Illinois


Recently, Stein (1960a, 1960b), Sprecher
(1959), Smith, Albright, Glennon, and Owens
(1961), and Buel (1960) have presented both
material and methods directed to the problems
of the assessment and identification of
creativity among research personnel in the industrial
setting. These studies illustrate several
methods of studying creative performance:
environmental surveys, descriptive psychometrics,
personal history information, and
behavioral rating scale items validated for the
purpose of differentiating between relatively
more or less creative research personnel. The
purpose of the study reported here was to investigate
the descriptive and predictive validity
of several psychometric instruments and
a locally constructed forced-choice scale.


METHOD

Subjects

The subjects in this study were research personnel
in the research center of a major cereal and animal
feed company.¹ The subjects were selected with regard
to whether their assignments were considered to
be in an endeavor which called for creative behavior,
regardless of whether they themselves were considered
to be creative. In this manner 54 persons (48
males and 6 females) were selected for study.² The
areas of technical endeavor represented by these persons
were cereal, animal nutrition, and organic research
with training in chemistry and biology as a
common background. Fifteen held the doctor's degree,
4 held the master's degree, 28 held the bachelor's
degree, and 18 others had training [illegible]
in their field but did not hold a degree. The age
of the group ranged from 22 to 59 with a [illegible]
of 34 years. Length of service with the [illegible]
ranged from 6 months to 27 years with a [illegible]
of [illegible] years.


Raters

Twenty-two raters evaluated the research personnel
by means of the Forced-Choice Instrument.
[Illegible]-one of these raters also served as [illegible].
The number of ratees assigned to
any rater ranged from 1 to 7. All 65 subjects were
evaluated with the Forced-Choice Instrument. In addition,
the codirectors of the research center participated
in an over-all ranking of the 65 subjects on a
criterion to be discussed below.

¹ The authors wish to thank Robert [illegible] and
the members of the Quaker Oats [illegible]
Center who made this study possible.

² An N of 65 [illegible] upon later [illegible] for
purposes of forced-choice validity and reliability.


Psychometric Instruments 


The psychometric instruments employed in this 
study were as follows: 


1. Adaptability Test, Form A (12 persons) and
Form B (42 persons). All 6 females responded to 
Form B. Forms A and B correlate .89 (Tiffin & 
Lawshe, 1954). 

2. Thurstone Temperament Schedule, Form AH 

3. Guilford-Martin Personnel Inventory 

4. Allport-Vernon-Lindzey Study of Values, Re- 
vised Edition 

5. Kuder Preference Record (Vocational—Form 
B1) 

6. Forced-Choice Instrument 


The items for the Forced-Choice Instrument were 
based on a previous study (Buel, 1960) in a pe- 
troleum research center. The items relate to research 
competence, methodology, technical application, per- 
sonal approach to complexity, perseverance, energy 
level, synthesizing ability, uniqueness of approach, 
objectivity, personal confidence, use of symbolism 
and analogy, etc. Since the format of this instru- 
ment has not previously been described in the litera- 
ture, it is briefly described here. 

As previously indicated, the items were of a be- 
havioral nature. The preference indices were in the 
form of the arithmetic means of the value (1 through 
5) which had been assigned to each item by each 
rater at a time when those items were in check list 
form and were used to describe research personnel. 
The item discrimination indices were in the form of 
Pearsonian coefficients calculated against a hypo- 
thetical criterion of creativity, identical in all but 
organizational references to the one employed in this 
study. 

The development of the tetrads for the scale was 
accomplished by matching two items on high pref- 
erence value and two items on low preference value, 
one item in each pair having significant differentiating 
ability (.05 level) and one item in each pair having 
little or no differentiating ability. In addition to this 
form of matching, an attempt was made to match 
each pair of items on standard deviation (Buel, 1959).
Finally, an attempt was made to match pairs of high 
and low preference value items on item length. In 
this way controls were imposed on central tendency, 
variability, and the desirability or attractiveness of 
an item as a function of length or brevity. The final
forced-choice scale consisted of 13 tetrads, the items
in each of which were to be ranked (1-4) in terms 
of their descriptiveness for the person being evalu- 
ated. Rundquist (1950) has recommended such a 
format, and it is at least reasonable to assume that 
the unpleasantness of forcing a choice is alleviated 
by this approach. A scale in this form
is scored by counting only Rank 1 and Rank 4,
which is equivalent to scoring the most and least
descriptive items.


Procedure 


The testing of personnel was accomplished by a 
qualified psychologist between 6 and 9 months prior 
to the inception of this study. All persons tested were 
members of the research center staff at the time of 
testing and had not been selected for employment by 
means of these tests. The forced-choice evaluations 
and the criterion rankings were performed at the be- 
ginning of the data collection. All test and rating re- 
sults were kept confidential. 


Criteria 


Three criteria of creativity were employed: 


1. A hypothetical criterion of creativity stated as 
follows: 


You have been selected to be director of research 
for another cereal and feed company. The Quaker 
Oats Company Research Center has agreed to let 
you take with you to your new organization cer- 
tain of the persons now at the Quaker Oats Com- 
pany Research Center. You have been instructed 
by your new management to bring with you those 
persons who will make the most significant, original,
and lasting contributions to research. Try to
focalize your evaluation on the three underlined
characteristics above, disregarding the individual's 
field of specialization. 


The two codirectors of research evaluated each of
the 65 persons on this basis. Each man was ranked 
in accordance with instructions which required that 
the group be divided into five subgroups, 13 persons 
per group, Group A being those most accurately de- 
scribed by the criterion, Group B those next best 
described, and so on down to Group E or those 
most poorly described by the criterion. Ranks 1 
through 13 were then assigned to each member of 
each group. These letter-number combinations were 
later converted into one continuous ranking from 1 
through 65, thus obviating the difficult task of continuously
ranking 65 persons.

Employing the technique proposed by Hull (1922), 
each subject’s rank score was converted into a linear 
scale score, amenable to treatment by linear correla- 
tion methods. The correlation between criterion raters 
was .73. Consequently, the average of each man’s 
linear scale scores served as his criterion score. 
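
The two-stage ranking and the conversion to a linear scale can be sketched as below. The first function follows the procedure as described (five groups of 13, ranked within group, then strung together). The second is only an assumption about the spirit of the Hull (1922) conversion: it places each rank at its percentile midpoint and reads off the corresponding unit normal deviate, since the article does not reproduce the conversion table it used.

```python
from statistics import NormalDist

GROUPS = "ABCDE"      # A = best described by the criterion, E = most poorly
PER_GROUP = 13


def continuous_rank(group, within_rank):
    """Convert a letter-number pair such as ('B', 4) to a rank from 1 to 65."""
    return GROUPS.index(group) * PER_GROUP + within_rank


def linear_score(rank, n=65):
    """Assumed rank-to-normal conversion: higher score for better (lower) rank."""
    proportion_above = (rank - 0.5) / n
    return -NormalDist().inv_cdf(proportion_above)


def criterion_score(rank_rater_1, rank_rater_2, n=65):
    """Average of the two raters' linear scale scores, as in the text."""
    return (linear_score(rank_rater_1, n) + linear_score(rank_rater_2, n)) / 2


# Hypothetical subject placed B-4 by one codirector and C-1 by the other.
r1 = continuous_rank("B", 4)
r2 = continuous_rank("C", 1)
print(r1, r2, round(criterion_score(r1, r2), 2))
```

Any monotone rescaling of these deviates (for example to a 0-10 scale) would leave the product-moment correlations unchanged, so the choice of units here is immaterial.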

2. Creativeness: a subsection of the company ap- 
praisal form, which was defined as follows: 




Ability to approach problems with an inquiring 
mind, with vision and imagination, to develop new 
ideas and improved methods, and to provide origi- 
nal solutions to problems. 


Ratings on this criterion were in the form of 
values from 1 to 3 and were assigned as part of the 
yearly evaluation of personnel. The raters who as- 
signed these values were in some instances the same 
persons who rated with the Forced-Choice Instru- 
ment. However, the 9-12 months between the two 
evaluations reduced any biasing effects as a
function of memory of previous ratings.

3. Proportional Contribution to Patents Issued: 
wherein each single author patent was scored as one 
point; if authored by more than one person, the 
point was divided equally among the several authors. 

The first criterion discussed was collected after the
forced-choice evaluation of the participants, while
the second criterion evaluation had been performed
prior to the inception of the study. The third
criterion was obtained from personnel records. Because
of the temporal order in which the criteria were
collected, relatively little bias is assumed to have
crept in by way of slanted test scores or forced-choice
evaluations.
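
The scoring steps described for the first and third criteria can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the inverse-normal transform stands in for the table published by Hull (1922), which is not reproduced in the text.

```python
from statistics import NormalDist

GROUPS = "ABCDE"   # Group A = best described by the criterion
GROUP_SIZE = 13    # 5 groups x 13 persons = 65


def continuous_rank(group: str, rank_in_group: int) -> int:
    """Map a letter-number pair such as ('B', 3) to a rank from 1 to 65."""
    if group not in GROUPS or not 1 <= rank_in_group <= GROUP_SIZE:
        raise ValueError("unknown group or rank outside 1..13")
    return GROUPS.index(group) * GROUP_SIZE + rank_in_group


def linear_scale_score(rank: int, n: int = 65) -> float:
    """Normal-deviate score for a rank; higher means better ranked.

    Assumption: the percent position (rank - 0.5) / n is passed through
    the inverse normal CDF in place of Hull's printed conversion table.
    """
    proportion_below = 1.0 - (rank - 0.5) / n
    return NormalDist().inv_cdf(proportion_below)


def patent_credit(author_counts: list[int]) -> float:
    """Proportional patent score: one point per patent, split equally."""
    return sum(1.0 / k for k in author_counts)


if __name__ == "__main__":
    rank = continuous_rank("C", 7)           # middle of the 65 persons
    print(rank, round(linear_scale_score(rank), 3))
    print(patent_credit([1, 2, 4]))          # sole, two, and four authors
```

A person's criterion score would then be the average of the two codirectors' linear scale scores, as the text states.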


Analysis 


Tables 1, 2, and 3 present, respectively, the
validities, intercorrelations between significant
predictors, and criterion intercorrelations. One
validity [value illegible], based on 65 cases, resulted
from correlating the forced-choice scores with the
first criterion (hypothetical), while with the same N
for split-half calculations (7 tetrads vs. 6 tetrads),
a reliability of [value illegible] was obtained,
increased to .92 by application of the Spearman-Brown
formula (an approximation because of the uneven
split). In all other situations N was equal to 54 or
42, as specified. Where necessary, adjustments were
made in the ranks assigned by the codirectors and
subsequently in the linear scale scores to account for
the reduced number of cases. All correlations are in
the form of Pearson product-moment coefficients.
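
The Spearman-Brown step can be illustrated with a short sketch. The half-test correlation is not legible in this copy, so the example works backward from the reported .92. The function names are hypothetical, and the formula assumes two equal halves, which the 7-versus-6 tetrad split only approximates, as the text notes.

```python
def spearman_brown(r_half: float, factor: float = 2.0) -> float:
    """Reliability of a test lengthened by `factor`, given the
    correlation between parts of the original length."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def half_from_full(r_full: float) -> float:
    """Invert the two-half case: the half-test correlation that would
    step up to `r_full`."""
    return r_full / (2.0 - r_full)


if __name__ == "__main__":
    implied_half = half_from_full(0.92)
    print(round(implied_half, 3))                  # half-test r implied by .92
    print(round(spearman_brown(implied_half), 2))  # steps back up to .92
```

Under these assumptions a stepped-up reliability of .92 corresponds to a split-half correlation of roughly .85.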


RESULTS AND DISCUSSION 


Inspection of Table 1 leads to some interesting
observations concerning the applicability of
standardized psychometric material for discriminating
between various levels of creative ability. Persons
whose research contributions are "significant,
original, and lasting" (Criterion 1) tend to be
active, dominant, and cooperative; are interested in
literary pursuits; and are uninterested in clerical
pursuits. Further, they are quite effectively
identified by the Forced-Choice Instrument. Persons
who are considered to possess “creativeness”
(Criterion 2) are more intelligent, are interested in
things literary, and are quite easily identified by
means of the Forced-Choice Instrument. Relative to the
number of patents issued (Criterion 3), persons with
literary interests, as well as those identifiable by
means of the Forced-Choice Instrument, tend to patent
[remainder illegible].


TABLE 1
VALIDITY COEFFICIENTS

                               Criterion 1   Criterion 2    Criterion 3
                               (Ranking)     (Creativeness) (Patents)
Predictor                      (N = 54)      (N = 42)       (N = 54)
Adaptability Test              .22           [illegible]    .13
Thurstone Temperament
  Schedule:
    Active                     [illegible]   .23            [illegible]
    Vigorous                   .04           [illegible]    .01
    Impulsive                  -.09          -.07           -.03
    Dominant                   .26*          .22            .13
    Stable                     .04           .21            -.08
    Sociable                   -.03          -.03           .11
    Reflective                 .22           .21            .05
Guilford-Martin Personnel
  Inventory:
    Objectivity                .20           .25            [illegible]
    Agreeableness              -.20          [illegible]    [illegible]
    Cooperativeness            .29*          [illegible]    .06
Allport-Vernon-Lindzey
  Study of Values:
    Theoretical                .30           .22            .18
    Economic                   -.08          [illegible]    -.06
    Aesthetic                  -.07          .01            .07
    Social                     .02           .05            .05
    Political                  .08           .09            [illegible]
    Religious                  [illegible]   -.24           -.20
Kuder Preference Record
  (Vocational):
    Mechanical                 [illegible]   [illegible]    [illegible]
    Computational              [illegible]   [illegible]    [illegible]
    Scientific                 .17           .09            -.09
    Persuasive                 [illegible]   [illegible]    [illegible]
    Literary                   .29**         .29*           .24*
    [label illegible]          -.01          .04            .16
    Musical                    .01           -.18           .03
    Social Service             [illegible]   [illegible]    [illegible]
    Clerical                   [illegible]   [illegible]    [illegible]
Forced-Choice Instrument       .65*          [illegible]    .25*
x Una. 2 reduced N by 12. 


Previously reported work lends credence to
some of the findings reported here. In relation
to activity, Barron (1957) demonstrates that
original persons tend to be more “energetic”
whereas unoriginal persons tend to be “slow”
and “apathetic.” Elsewhere Barron (1955)
hypothesizes and demonstrates that "Dominance"
is characteristic of original persons.



TABLE 2
INTERCORRELATIONS AMONG SIGNIFICANT PREDICTORS
(N = 54)

                                                             Guilford-
                             Adapt-                          Martin:
                             ability  Thurstone: Thurstone:  Coopera-     Kuder:    Kuder:
Predictor                    Test     Active     Dominant    tiveness     Literary  Clerical
Thurstone: Active            .13
Thurstone: Dominant          -.05     .22
Guilford-Martin:
  Cooperativeness            [illeg.] [illeg.]   .06
Kuder: Literary              .08      .16        .05         .02
Kuder: Clerical              -.06     -.14       -.50        [illegible]  -.21
Forced-Choice Instrument     .22      .17        .31         .23          .06       [illegible]


Maizell (1958), investigating the reading
habits of research chemists, suggests that the 
more creative are given to greater amounts of 
technical reading on the job. Albright and 
Glennon (1961) find essentially the same 
tendency when differentiating between re- 
search on a "research" vs. "administrative" 
career aspiration criterion, while Roe (1952) 
describes eminent scientists as having de- 
veloped a childhood habit of prodigious read- 
ing. Maschino (1959), considering the factors 
related to success in technical jobs, presents 
evidence to the effect that successful tech- 
nical personnel are disinterested in clerical 
work and, as demonstrated in numerous other 
research reports, are significantly more intelli- 
gent. Cooperativeness, although not specifi- 
cally corroborated by the work cited here, 
gains significance if one chooses to construe 
Ascendance (Barron, 1957) and Tolerance 
(Gough & Woodworth, 1960) as components 
thereof. Conversely, Roe's work relative to 
the researcher's need for personal independ- 
ence (1952) and lack of personal relations 
skills (1956) would seem to be contradictory. 
The fact that the correlation of Cooperativeness
with Criterion 3 (patents), a relatively
objective criterion, is not significantly different
from zero may indicate that rater and subject
bias in favor of a socially desirable characteristic
like Cooperativeness has contributed
to the magnitude of the correlations between
it and two somewhat subjective criteria.
Table 2 presents the matrix of intercorrelations
among the variables singled out for consideration
as discriminators between relatively
creative and noncreative research persons.
It is apparent that, with the possible exception
of Dominance and Clerical (-.50) and
Dominance and the Forced-Choice Instrument
(.31), the discriminating measures are
independent of one another. Considering Active,
Cooperativeness, Literary, and Clerical
(Dominance contributes nothing because of
high intercorrelations) as a predictive battery
for Criterion 1 leads to a multiple correlation
of .47 (N = 54), shrunken to [value illegible]
(Guilford, 1950; Kelley & Salisbury, 1926),
significant at the .01 level of confidence.
Selecting the Adaptability Test and Literary as
predictors of Criterion 2 results in a multiple
correlation of .43, shrunken to .41, significant
at the .01 level (N = 42).
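
The shrinkage correction is cited to Guilford (1950) but not printed in the text. The sketch below is an assumed form of that correction, chosen because it reproduces the legible pairs of values (.43 to .41 here, and .41 to .39 in the forced-choice analysis). The function name and the use of the number of predictors in the denominator are assumptions, not a quotation of the cited source.

```python
from math import sqrt


def shrunken_r(r: float, n: int, predictors: int) -> float:
    """Shrunken multiple correlation under the assumed correction
    R'^2 = 1 - (1 - R^2) * (N - 1) / (N - m), with m predictors."""
    adjusted = 1.0 - (1.0 - r * r) * (n - 1) / (n - predictors)
    return sqrt(max(adjusted, 0.0))  # guard against a negative estimate


if __name__ == "__main__":
    # Criterion 2: Adaptability Test and Literary, N = 42
    print(round(shrunken_r(0.43, 42, 2), 2))
    # Criterion 1: four-predictor battery, N = 54
    print(round(shrunken_r(0.47, 54, 4), 3))
```

Under this assumed formula the four-predictor battery for Criterion 1 (.47, N = 54) would shrink to roughly .42; that figure is a computed illustration, not a reading of the printed value.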
Although the authors have heretofore viewed
the Forced-Choice Instrument as a predictor
of various criteria, it is equally justifiable to
view it in the role of a criterion. Using the
Forced-Choice Instrument (N = 54) as a
criterion results in a validity for Dominance
of .31 (.04 level) and for Theoretical of
[value illegible] (.06 level), correlating
between them at -.03 (presented here but not
in tables). Combining the two eventuates in a
multiple correlation of .41, shrunken to .39
(.01 level). It appears that, even within the
limits of this study, the Forced-Choice
Instrument is a reasonable criterion.
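
How two validities and their intercorrelation combine into a multiple correlation can be shown with the standard two-predictor formula. The Theoretical validity is not legible in this copy, so the value .26 used below is a placeholder chosen only because it is consistent with the reported multiple correlation of .41; it is not taken from the source. The function name is hypothetical.

```python
from math import sqrt


def multiple_r_two(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion with two predictors:
    R^2 = (r_y1^2 + r_y2^2 - 2 r_y1 r_y2 r_12) / (1 - r_12^2)."""
    r_squared = (r_y1**2 + r_y2**2 - 2.0 * r_y1 * r_y2 * r_12) / (1.0 - r_12**2)
    return sqrt(r_squared)


if __name__ == "__main__":
    dominance = 0.31          # reported validity against the instrument
    theoretical = 0.26        # placeholder; printed value is illegible
    between_predictors = -0.03
    print(round(multiple_r_two(dominance, theoretical, between_predictors), 2))
```

Because the two predictors are nearly uncorrelated, the squared multiple correlation is close to the sum of the two squared validities.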
Table 3 presents the correlations between
the criteria. Criterion 3 (patents) is more
independent than Criterion 2 (creativeness),
and Criterion 1 (hypothetical) is the least
independent of the three, suggesting that it may
be the most representative criterion. Criterion
3 is relatively independent but at the same
time reflects the possibility that it is
apparently not measuring creativity as conceived
of by research directors and supervisors. The
lower correlations between Criterion 3 (patents)
and the other two criteria may reflect
the many unaccounted-for variables involved
in securing patents or, perhaps, the
entrepreneurship qualities operating in research
personnel who obtain patents.


TABLE 3
CRITERION INTERCORRELATIONS

Criterion                      1             2
1 (Ranking) (N = 54)
2 (Creativeness) (N = 42)      .64
3 (Patents) (N = 54)           [illegible]   [illegible]

To use a patent criterion in such a study
may be suspect when one recalls that certain
of the subjects had been members of the
research center staff a relatively short time.
Obviously such persons, even if highly creative,
had not had a chance to patent within
this organization. Patents were included as a
criterion more because of availability than
because of their purity as a measure of creative
ability, even had all persons been staff members
long enough to make such a criterion
equitable. The authors feel that mere
pro[...]sity of patents is not a true indicator of
creative ability.
Future investigations might develop the
merit of patents as criteria if the differential
values of patents were considered. If a
method of quantifying the value of patents
(perhaps by a panel consisting of the
author and several experts) were available,
a more meaningful patent criterion would
result. With regard to the [...],
subtests which approach significance for
one or several criteria, but which have not
previously been discussed, might bear further
scrutiny and item analysis. Stable, Reflective,
Objective, Agreeableness (negative), Artistic
(negative), and Religious (negative) might
well be developed into significant indicators of
the personality and value structure of creative
persons.


SUMMARY


Psychometric instruments of varying na- 
ture were used to evaluate research person- 
nel in a major cereal and feed company re- 
search center. Three criteria of creativity 
were collected against which the psychometric 
instrument validities were calculated. Certain 
intelligence, interest, and personality variables 
and the Forced-Choice Instrument were dem- 
onstrated to be valid, singly and in combina- 
tion, for differentiating between relatively 
creative and noncreative research workers. 


REFERENCES 


Albright, L. E., & Glennon, J. R. Personal history
correlates of physical scientists' career aspirations.
J. appl. Psychol., 1961, 45, 281-284.

Barron, F. The disposition toward originality.
J. abnorm. soc. Psychol., 1955, 51, 478-485.

Barron, F. Originality in relation to personality and
intellect. J. Pers., 1957, 25, 730-742.

Buel, W. D. Stability of preference indices in
forced-choice rating scale items. Engng. industr.
Psychol., 1959, 1, 134-137.

Buel, W. D. The validity of behavioral rating scale
items for the assessment of individual creativity.
J. appl. Psychol., 1960, 44, 407-412.

Gough, H. G., & Woodworth, D. G. Stylistic variations
among professional research scientists. J. Psychol.,
1960, 49, 87-98.

Guilford, J. P. Fundamental statistics in psychology
and education. New York: McGraw-Hill, 1950.

Hull, C. L. The computation of Pearson’s r from
ranked data. J. appl. Psychol., 1922, 6, 385-390.

Kelley, T. L., & Salisbury, F. S. An iteration
method for determining multiple correlation
constants. J. Amer. Statist. Ass., 1926, 21, 282-292.

Maizell, R. E. The most creative chemist reads
more. Industr. engng. Chem., 1958, 50, 64A-65A.

Maschino, A. Factors related to success in technical
jobs. Midland, Mich.: Dow Chemical, 1959.
(Mimeo)

Roe, A. A psychologist examines sixty-four eminent
scientists. Scient. Amer., 1952, 187, 21-25.

Roe, A. The psychology of occupations. New York:
Wiley, 1956.

Rundquist, E. A. Personality tests and prediction.
In D. H. Fryer & E. R. Henry (Eds.), Handbook
of applied psychology. New York: Rinehart, 1950.
Pp. 182-191.

Smith, W. J., Albright, L. E., Glennon, J. R., &
Owens, W. A. The prediction of research competence
and creativity from personal history.
J. appl. Psychol., 1961, 45, 59-62.

Sprecher, T. B. A study of engineers’ criteria for
creativity. J. appl. Psychol., 1959, 43, 141-148.

Stein, M. I. The research environment survey.
Chicago: Science Research Associates, 1960. (a)

Stein, M. I. The survey for administrators. Chicago:
Science Research Associates, 1960. (b)

Tiffin, J., & Lawshe, C. H. Examiner manual for
the Adaptability Test. Chicago: Science Research
Associates, 1954.


(Received December 9, 1960) 




Journal of Applied Psychology
1961, Vol. 45, No. 6, 359-363


PILOT JUDGMENTS OF SIMULATED COLLISIONS AND NEAR MISSES:
A COMPARISON OF PERFORMANCE WITH UNCODED AND
TWO-TONE CODED MODELS


JOHN E. ROBINSON, JR., KENNETH G. COOK, AND CHARLES E. ZELENY
Applied Psychology Corporation 


Aircraft pilots agree that avoiding midair
collisions in Visual Flight Rules weather¹
involves more than merely sighting the intruder
aircraft; but they do not agree on just how
much more: how many kinds of information
are needed and what degrees of precision are
required (Calvert, 1958; Robinson, 1959).
This question continues to arouse arguments
in ready-rooms, at manufacturers’ conferences,
and at flight safety meetings and regulatory
deliberations.

Projector and Robinson (1958) in an analytical
survey of aircraft navigation light systems
list eight kinds of information mentioned
in aviation literature as important, to some
unspecified degree, in avoiding collisions during
“good” weather conditions (Visual Flight
Rules):

Presence or location      Heading or course
Identification            Airspeed
Attitude                  Distance or range
Relative altitude         Intended maneuver

Presumably, it was felt that if a pilot could
get these kinds of information about “intruder”
aircraft, he would be able to make
faster, more accurate decisions about the
possibility of collision, and thus, whenever
avoidance was necessary, make the correct
maneuver sooner and more accurately.

A research approach to the evaluation of
this basic concept would seem to begin with
the measurement of visual perception of an
object moving through three-dimensional
space. A first task would be to resolve [...]
visual perception [...] for elemental objects.
Subsequently, perception of objects with more
complex [...] would be measured, after
additional items of visual information had been
provided.


¹ Part 60, Civil Air Regulations, with amendments.


Aircraft shape has been shown to be ad- 
vantageous in the perception of relative mo- 
tion when compared to the relative motion 
of a spherical object (Kelley, Bowen, De- 
Groot, Frank, & Channel, 1959). Individual 
observers were seated in a corner of a black 
curtained “room.” A visible stimulus object 
whose support was invisible was moved across 
the room in a random sequence of straight- 
line directions. The observer had to indicate 
his perception of the relative motion of the 
object. His answers were scored on the basis 
of speed and accuracy. The first object was 
a small, silver painted sphere. Very little 
visual information was available from the 
shape of the stimulus, and the scores re- 
flected certain amounts of erroneous relative- 
motion judgment. It was reported that when 
the sphere was replaced with a small, silver 
painted aircraft model, judgments of relative 
motion were made significantly faster and 
more accurately. 

The experiment reported in the present
text² was designed to investigate and provide
the answer to the next logical step in
this research progression: Will the relative 
motion of aircraft models painted half dark 
and half light be perceived sooner and more 
accurately than the relative motion of one 
painted all light? 

The answer was sought through visual
presentation of a number of simulated collisions
and near-collisions, using an F-100 flight
simulator and an F-151 aerial gunnery trainer
attachment. Pilots who served as subjects
used apparatus which made continuous records
of their judgments during the simulated
flight situations. The three judgments possible
were “Undecided,” “Collision,” and
"Miss."


² Part of research under Contract No. FAA/BRD-[...],
Analysis and Comparison of Visual Collision-[...]
Systems, funded by the Bureau of Research and
Development (now Aviation Research and Development
Service), Federal Aviation Agency.


The paint patterns used on the intruder 
aircraft models were in no way intended to 
presage operational paint schemes for full- 
sized aircraft. Rather, they were designed to 
code certain visual aspects of the aircraft to 
see if such coding would help pilots make 
better judgments than could be made with an 
uncoded model. If the aspect information had 
turned out to be of value, application would 
have been developed for both daytime opera- 
tions (paint) and nighttime operations (lights). 


PROCEDURE 


Each pilot who participated in this study solved 
his test problems while seated in the cockpit of an 
F-100 jet fighter in one of the simulator rooms at 
the Federal Aviation Agency's National Aviation Fa- 
cilities Experimental Center, Atlantic City, New Jer- 
sey. In front of this cockpit was a concave plaster 
screen, approximately two stories in height and ex- 
tending 240 degrees in azimuth. The general level of 
lighting represented moonlight conditions (7 to 10 
foot-lamberts ambient light). No terrain features
were visible, but a fuzzy horizon was distinguishable 
at the border of the “moonlit” area. 

A closed circuit television system was used to
project an image of a scale model B-47 aircraft in a
series of different courses relative to the course of 
the cockpit. These included flight towards the F-100 
and away from it on one of six relative headings. 
Relative altitude of the courses was also varied, with 
the B-47 image being made at various times to ap- 
pear to be ascending towards, descending upon, or 
flying level with the F-100. The outcome was some- 
times a “collision,” at other times a “near-collision,” 
and at others a clear miss. 

The “intruder” image appeared on the screen as a 
white area with a brightness of .55 foot-lamberts. 
The brightness of the “sky” (the area of the screen 
above the horizon line) was .1 foot-lambert. The 
maximum simulated range was 24,000 feet, at which 
distance the television image of the B-47 appeared 
only as a white dot. At distances closer than the 
minimum possible range (600 feet) the intruder 
image disappeared. 

A separately located television camera was focused 
on the scale-model aircraft, which by means of a 
mount could be moved various distances from the 
camera to simulate changes in the intruder’s dis- 
tance. A gimballed suspension system permitted ma- 
neuver of the scale model in pitch, roll, and yaw. 
The amount and speed of rotation in each of these 
three flight dimensions, or any combination of them, 
were controlled by analog computer equipment;
complex driving mechanisms assured that the
apparent motion and maneuvers of the intruder
image were faithfully maintained with respect to the
cockpit in which the pilot sat.

The fact that computer settings could be made to
start a predetermined flight situation, and carry it to
completion, made it possible to have standardized
problems. Thus, the selected flight situations could
be presented repeatedly with reasonable assurance
that successive presentations were identical.

Six FAA pilots (two military, four civilian) served
as subjects. All had had flight experience greatly in
excess of the 1,000-hour minimum which the
investigators established. All were actively engaged in
flying at the time of the experiment, and the nature of
their duties at the experimental center supports the
belief that all possessed more than average interest
in experimentation aimed at improving air safety.

Each subject was provided with 2 hours of
orientation and practice prior to participating in four
test sessions. Each test session contained 6 practice
problems and 12 test problems involving one model.
The order in which the models were assigned to the
sessions as targets was counterbalanced to prevent
bias from learning or practice effects. With
individual problem times varying from 34 to 72 seconds,
plus the time required to reset the computer for each
new problem, single test sessions lasted
approximately 50 minutes.

As soon as a problem was ready for presentation,
the experimenter alerted the pilot (“Your next
problem will appear soon at approximately 10 o'clock
and level"). The pilot was not required in any case
to make evasive maneuvers, but instead was asked
only to judge whether or not the B-47 image would
“collide” with his aircraft, assuming that speeds and
flight paths would remain constant.

By pressing one of three keys on a board strapped
to his knee, the pilot indicated either that he was or
was not on a collision course, or that he was
undecided. The keys were wired to a pen recorder in
the control room. The resulting pen-trace record
provided a means of measuring the time and [...] of
each judgment. A second pen, wired in circuit with
the master switch, recorded the exact starting time
of each problem.

The intruding B-47 images were visually coded in
four separate patterns. One of the scale models
(Model A) was painted entirely white, thereby
requiring the pilot to make his collision judgments
only on the basis of apparent aircraft structure and
relative motion. The other three models were painted
in two-tone schemes, as follows:


Model B: front half ...
         back half ...
Model C: top half ...
         bottom half ...
Model D: left half ...
         right half ...

The complete experiment consisted of two
presentations each of 36 test problems for each of four
intruder models. The total number of problems
available for analysis from the six pilots amounted
to 288. (The 144 practice problems were not included
in the analysis of results.)
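
The problem counts follow directly from the session design, and a few lines of arithmetic make the bookkeeping explicit. The constant names below are illustrative; the figures are those given in the procedure.

```python
PILOTS = 6
SESSIONS_PER_PILOT = 4        # one session per intruder model
PRACTICE_PER_SESSION = 6
TEST_PER_SESSION = 12
PROBLEM_SECONDS = (34, 72)    # shortest and longest problem

practice_total = PILOTS * SESSIONS_PER_PILOT * PRACTICE_PER_SESSION
test_total = PILOTS * SESSIONS_PER_PILOT * TEST_PER_SESSION
test_per_model = PILOTS * TEST_PER_SESSION      # the "all pilots" cell
test_per_pilot = SESSIONS_PER_PILOT * TEST_PER_SESSION  # the "all models" cell

# Problem time alone in one session, in minutes; the rest of the roughly
# 50-minute session went to resetting the computer between problems.
problems_per_session = PRACTICE_PER_SESSION + TEST_PER_SESSION
session_minutes = tuple(problems_per_session * s / 60 for s in PROBLEM_SECONDS)

if __name__ == "__main__":
    print(practice_total, test_total, test_per_model, test_per_pilot)
    print(session_minutes)
```

The per-model and per-pilot totals are the maxima quoted in the note to the accuracy table.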

Pilot judgments were scored in terms of accuracy 
and speed. Accuracy scoring consisted of “correct” 
and “incorrect” categories, with judgments further classified
in terms of initial judgments (decisions made earliest 
in each problem) and final judgments (the decision 
that was being indicated at the end of each prob- 
lem). 

For scoring of speed, five categories were deline- 
ated: (a) “Search” time (from start of problem to 
beginning of Undecided trace), (b) “Undecided” 
time (total time the pen trace was in Undecided 
area), (c) time to reach initial collision judgment 
(the sum of Search and Undecided times), (d) “Cor- 
rect Response" time (total time the pen trace was 
indicating the correct problem outcome), and (e) 
"Incorrect Response" time (total time the pen trace 
was indicating an incorrect problem outcome). Be- 
cause there were different time lengths among the 
36 problems, actual time values were transformed 
into proportions of total problem time, and analysis 
was based on the proportional values.
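
The five speed categories can be sketched as a computation over a pen-trace record. The record layout below (a list of timed segments) and all names are hypothetical, since the text does not describe how the traces were tabulated. The sketch also assumes the first key pressed in each problem is Undecided, so that Search time ends where the first segment begins.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    state: str     # "undecided", "collision", or "miss"
    start: float   # seconds from the start of the problem
    end: float


def speed_scores(segments: list[Segment], outcome: str,
                 problem_seconds: float) -> dict[str, float]:
    """Return the five speed categories as proportions of problem time."""
    ordered = sorted(segments, key=lambda s: s.start)
    search = ordered[0].start if ordered else problem_seconds
    undecided = sum(s.end - s.start for s in ordered if s.state == "undecided")
    correct = sum(s.end - s.start for s in ordered if s.state == outcome)
    incorrect = sum(s.end - s.start for s in ordered
                    if s.state not in ("undecided", outcome))
    raw = {
        "search": search,
        "undecided": undecided,
        "to_initial_judgment": search + undecided,  # defined as their sum
        "correct": correct,
        "incorrect": incorrect,
    }
    return {name: seconds / problem_seconds for name, seconds in raw.items()}


if __name__ == "__main__":
    trace = [Segment("undecided", 5, 15), Segment("miss", 15, 25),
             Segment("collision", 25, 40)]
    print(speed_scores(trace, outcome="collision", problem_seconds=40))
```

Dividing by problem length is what allows problems of 34 to 72 seconds to be pooled in one analysis.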

After all the pilots had completed all test sessions,
individual interviews were conducted to obtain
information relevant to the data that had been
collected. Topics included in the Interview Guide were:
adequacy of training, convenience of response
apparatus, realism of the visual presentations, nature
of paint-coding that yielded best judgments, chances
of correctly anticipating particular flight situations,
appropriateness of test session length, and bases on
which judgments were made.


RESULTS 


The analysis of the data resulting from the
experiments can be discussed conveniently
under three topics: accuracy of judgments,
speed of judgments, and effect of flight
situation.


Table 1 summarizes the results of scoring
the pilot judgments for accuracy according to
individual pilots and individual target paint
schemes. The numbers of correct judgments,
both initial and final, indicate that the pilots
were highly skilled. Of the 288 initial
judgments 201 (70%) were correct, while 261
(91%) of final judgments were right. The
subjects, as individuals, varied widely in
accuracy of initial judgments, ranging from
46% to 83% correct. On final responses, the
six pilots were very nearly equally correct,
ranging from 85% to 98%. The variation in
response accuracy classified according to the
four paint-coded models was [...] both for
initial and final judgments. Initial judgments
ranged between 64% and 75%, while on final
judgments the correct responses ranged from
89% to 93% of the 288 problems.


TABLE 1
ACCURACY OF PILOTS’ JUDGMENTS

                   Model paint-coding
                   Front-   Top-     Left-    All
Pilot      White   back     bottom   right    models

Initial judgments
1            6       7        6        3        22
2            5       6        9        6        26
3            8       9       12        8        37
4           11      11        8        9        39
5           12      10        8       10        40
6            9      11        7       10        37
All pilots  51      54       50       46       201

Final judgments
1           10      12       11       12        45
2           10      11       10       10        41
3           10      10       12       12        44
4           12      11        9       10        42
5           12      12       11       12        47
6           10      10       11       11        42
All pilots  64      66       64       67       261

Note. The maximum possible values are as follows: single
cell, 12; “all models" cell, 48; “all pilots" cell, 72.
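
The percentages quoted in this paragraph can be recomputed from the cell counts of the accuracy table. The sketch below is a cross-check rather than part of the original analysis. The front-back entry for Pilot 1 on initial judgments is taken as 7, the value consistent with the printed row total (22) and column total (54); that reading is an assumption about a poorly printed digit.

```python
MODELS = ["White", "Front-back", "Top-bottom", "Left-right"]

INITIAL = {1: [6, 7, 6, 3], 2: [5, 6, 9, 6], 3: [8, 9, 12, 8],
           4: [11, 11, 8, 9], 5: [12, 10, 8, 10], 6: [9, 11, 7, 10]}
FINAL = {1: [10, 12, 11, 12], 2: [10, 11, 10, 10], 3: [10, 10, 12, 12],
         4: [12, 11, 9, 10], 5: [12, 12, 11, 12], 6: [10, 10, 11, 11]}

PER_PILOT_MAX = 48    # 4 models x 12 problems
PER_MODEL_MAX = 72    # 6 pilots x 12 problems
GRAND_MAX = 288


def percent(correct: int, possible: int) -> int:
    """Percentage correct, rounded to the nearest whole number."""
    return round(100 * correct / possible)


def summarize(table: dict[int, list[int]]) -> dict[str, object]:
    """Totals and percentage ranges by pilot and by paint-coding."""
    by_pilot = [sum(row) for row in table.values()]
    by_model = [sum(column) for column in zip(*table.values())]
    return {
        "total": sum(by_pilot),
        "overall_pct": percent(sum(by_pilot), GRAND_MAX),
        "pilot_pct_range": (percent(min(by_pilot), PER_PILOT_MAX),
                            percent(max(by_pilot), PER_PILOT_MAX)),
        "model_pct_range": (percent(min(by_model), PER_MODEL_MAX),
                            percent(max(by_model), PER_MODEL_MAX)),
        "by_model": dict(zip(MODELS, by_model)),
    }


if __name__ == "__main__":
    print(summarize(INITIAL))
    print(summarize(FINAL))
```

Note that the by-model percentages are taken against 72 problems per model, not against all 288.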

Examination of the results of scoring the 
pilot judgment records for time of decision 
reveals an interesting pattern of variability. 
Four of the six pilots required nearly identical 
amounts of time to detect the targets, while 
two pilots were markedly faster than the 
others. When Search time and Undecided time 
are summed, a considerable spread in propor- 
tion of problem time (23-69%) was evident 
among the six pilots. The variation in the 
proportion of time spent in making Correct 
judgment responses ranged from 27% to
54%; the time spent in making Incorrect re- 
sponses varied from 4% to 26% of total prob- 
lem time. Only in the case of Undecided time 
was the variability among pilots statistically 
significant beyond the .01 level. The different 
paint-codings did not cause statistically
significant variation in the proportion of total
problem time spent on any response category
except for Search time. Here, all of the
two-tone models show longer times than the 
all-white model. This undoubtedly resulted 
from the fact that the “dark” portion of the 
two-tone models (which was in actuality a 
mottled gray) reduced the total brightness of 
these images as compared with the brightness 
of the all-white image. 

Examination of scores to determine the ef- 
fect of relative flight path on the accuracy and 
speed of pilots’ judgments reveals that this 
variable did not appear to affect the responses 
significantly. Although individual differences 
occur among pilots and among categories of 
data, the variability does not reach statisti- 
cal significance at conventional levels. Con- 
sideration of the variability in practical or 
operational terms similarly shows no differ- 
ence among responses to the several targets 
that would warrant support of the two-tone 
codings as beneficial to the visual evaluation 
of collision threats. 


The amount of vertical separation in flight 
path (at the completion of the problems) ap- 
pears to have a general relation to the speed 
of pilots’ judgments. That is, examination of
the Undecided time reveals that the greater 
the vertical miss, the shorter the indecision. 
The closest vertical miss (400 feet) and both 
of the horizontal crossover misses required 
the longest times for making decisions. 

In the postexperiment interviews, most of 
the pilots felt they did best with the all-white 
model, and they felt that the top-bottom 
coded model was probably most difficult. 
Their comments about the orientation and 
training were all favorable, and indicated 
that a little less training might have been 
adequate. The response apparatus was judged 
convenient to use and free from ambiguity. 
The length of the test sessions appeared to
produce no fatigue or feelings of unreasonable 
demands. 

Comment on the realism of the problems 
pointed up the fact that seeing a bright air- 
craft against a dark sky—as if a distant spot- 
light were illuminating only the intruder— 
would be somewhat rare in nature. Nevertheless,
the pilots stated that the appearance of
the problems was sufficiently realistic to
permit their use in this phase of collision
avoidance research.

In reaching their collision judgment de- 
cisions, all pilots reported that their basic 
method was to look for relative motion of the 
intruder image. If the image appeared to have 
any perceptible movement, the chances were
good that a miss would result. Since the
cockpit was perfectly steady during the problems,
the relative motion cue could be (and was) 
utilized to great advantage. 


CONCLUSIONS 


It is felt that two major conclusions may
be made from the data obtained in this
investigation.

First, since the accuracy of collision
judgments of visually coded targets showed no
statistically significant improvement over
judgments made with an uncoded target, visual
coding of the type used here does not appear
to be a substantial aid to the pilot who must
make a visual evaluation of midair collision
threats.

Second, the consistent reports of the pilots
in the postexperiment interviews indicate that
their fundamental visual cue for making
collision-or-miss decisions is perceived relative
motion of the target image.

The investigators believe both findings have
important implications in determining the
amount and kinds of information a pilot
requires to avoid midair collisions in VFR
operations.


SUMMARY


An F-100 flight simulator and an F-151
aerial gunnery trainer with a black-and-white
closed circuit television system were used to
present standardized visual flight problems
involving a variety of relative course situations
and altitude changes, resulting sometimes in
“collisions” and sometimes in one of a variety
of “near-collisions.”

Six experienced pilots completed a total of
144 training and practice problems and 288
test problems. Objective records taken during
each problem permitted quantitative
measurement of the accuracy and speed with which
the pilots were able to identify the outcomes
as “collision” or “miss.”



Four target images, all generated from
miniature B-47 models, were presented in
counterbalanced test session order and in
random sequences of problems. One model was
painted all white; the other three were treated
with dark and light paint to identify pairs of
the cardinal aspects of any aircraft: top versus
bottom, front versus back, and left versus
right.

The objective of the investigation was to 
determine whether or not these visual codings 
would improve pilot ability to judge the out- 
comes of the simulated flight situations. 

Statistical analysis was accomplished in
terms of descriptive values for accuracy and
speed, and of inferences drawn from analysis
of variance. No statistically significant
differences relevant to the objectives of the study
were obtained between the two-tone coded
models and the all-white uncoded model. The
investigators believe it is appropriate to
conclude that visual coding of the type studied
is not of significant aid to a pilot in visually
determining the existence of a collision threat
or a safe passage; and that pilots rely on
perception of relative motion more than any
other factor in judging the probability of
colliding with another aircraft.




(Early publication received April 24, 1961) 


Journal of Applied Psychology 
1961, Vol. 45, No. 6, 364—368 


VARIABILITY OF STROKE WIDTH WITHIN DIGITS 


CHARLES L. HUGHES 


International Business Machines, New York 


Typically, studies of stroke width of nu- 
merals have used a uniform stroke width over- 
all. Chapanis, Garner, and Morgan (1949) 
suggested that this may not be the optimum 
condition, and they outlined some further re- 
search needed in which the various compo- 
nents of the digits are emphasized by varied 
stroke width within the numeral. This offered 
a new dimension in digit design. Studies di- 
rectly relating to the present problem are in 
short supply, with the exception of an article 
by Soar (1958). Starting from an analysis of 
research results showing the confusion result- 
ing from the use of conventional arabic nu- 
merals, Soar developed new sets of arabic 
digits in which parts of the numerals were 
distorted in form and/or stroke width to em- 
phasize a difference between them. The com- 
mon elements were minimized and the unique 
elements were emphasized. Six of the experi- 
mental digits proved to be significantly more 
visible than the usual arabic digits. 

The relative readability of different digits 
depends on the criterion used as a legibility 
measure, the style or form of the numerals, 
and the stroke width. The digits 0, 3, 6, and 
9 are most often confused with each other. 
They usually have curved rounded outlines. 
In most studies the digit 7 generally has the 
best visibility. This gives part of the rationale 
for the trend toward angles and away from 
curves. “Angularity” is the term applied to 
the de-emphasis of roundness or curvature of 
form and the use of sharper curves and 
straight lines in the numeral, as in the 0, 3, 
6, and 9 versus the 7. 

Two earlier experiments by Soar (1955a, 
1955b) served to fill in the background lead- 
ing up to the present problem. Significant in- 
teraction between stroke width, illumination 
level, and figure-ground contrast was demon- 
strated, with stroke width to height ratios the 
same overall (constant even stroke over the 
entire figure). Conventional arabic numerals
were used, but modified somewhat in form
toward greater angularity.


Additional evidence has been compiled
which suggests that angular straight-line figures
are more readable, with symbolic digits
receiving increasing but contradictory attention.
Symbolic digits are simplified straight-line
figures that have cognates in the arabic
numerals. They have no curved lines. Some 
figures that are in between in form (dis- 
torted, angular digits) were used by Berger 
(1944a, 1944b). 

In an often cited study Mackworth (1944)
devised some new numerals and letters that
simplified the components and curvature of
conventional forms. The new angular digits
were readable at significantly greater distances
than the conventional ones. Several other studies
have made comparisons of the relative visibilities
of different designs, namely, Brown,
Lowery, and Willis (1951) and Lansdell
(1954). McLaughlin (1948) compared some
standard sets of digits against new designs
having variability of form and stroke width
within the numerals. The results show that
the form of the digits was not significant, but
the within-varying stroke width was. These are
at odds with the previously reported studies
where form and even-overall stroke width
were both significant. The results are further
confused by a retest of McLaughlin's digits
(Soar, 1958) in which the digit forms were
significant. Varying stroke widths with conventional
numerals complicates the problem
so that the curvature and stroke width
changes cannot be separated.

Many studies have been done on the effect
of stroke width to height ratios on visibility.
In all cases the same stroke width was
used for the entire set of digits and was not
varied within any set or within any one
digit. Berger (1944a, 1944b), Brown (1953),
and Brown and Lowery (1949) all obtained similar
results. In addition Shapiro (1951), Loucks
(1944), Craik (1941), Kuntz and Sleight
(1950), and Forbes and Holmes (1939) obtained
similar results with optimal stroke
width to height ratios falling between 1:6
and 1:8. Actually only McLaughlin (1948)
and Soar (1958) have varied stroke widths
within digits. Furthermore, as we have mentioned,
they could only make a partial investigation
of the relative contributions of stroke
widths and form variation to readability of
digits.

All these studies have included variations 
of conventional arabic numerals. Cohen and 
Webb (1953) suggested symbolic numerals 
made from a six-element straight-line matrix. 
These particular symbolic digits did not give 
as good a performance as conventional nu- 
merals. The study did result in a suggested 
eight-element straight-line matrix which was 
tested (Alluisi & Martin, 1958; Alluisi & Muller,
1956). Performance on these numerals
was not very different from performance on
conventional numbers. The symbolic digits,
however, offer a way to approach the problem
of form, within-variable stroke width,
and their interaction.

The primary conclusions which may be
drawn from the previous research on digit
design cited above would appear to be as
follows:

1. The degrees of boldness of stroke lie
within the range (SW/H ratios) 1:6 to 1:8
for best visibility.

2. Overall width to height ratios may range
from 1:1 to 7:10.

3. Angularity increases visibility.

4. Increased inclosed white space aids visibility.

5. No adequate theory or hypothesis has
yet been formulated to account for the differences
in visibility demonstrated in research
studies.

Questions which remain to be answered are:

1. What are the relative visibilities of conventional
arabic numerals and symbolic numerals?

2. Do form and stroke width interact in
digit perception?

3. Do some digits require greater or lesser
stroke width for best visibility than other
digits (should stroke width vary within a set
of digits)?

4. Do some digits require a variation of
stroke width within that particular digit,
through stroke width emphasis of some part
of the digit for best visibility (should stroke
width vary within a single digit)?


FIG. 1. Configurations.


PROBLEM AND METHOD 


The present study was an attempt to study sys- 
tematically the effect of variations in within-variable 
stroke width, conventional versus symbolic digits, 
emphasizing the unique aspects of digits through 
stroke boldness, angularity versus curvature, and the 
interaction of these variables. 

Subjects. The subjects (Ss) were all male students
enrolled in a course in Engineering Psychology, all
with normal vision or with optical correction.

Apparatus. The stimuli were a set of cards on
which the digits were drawn, one digit per card; each
digit was 41” high, and two stroke widths were used. The
stroke width to height ratios (SW/H) were either
1:6 or 1:8. These stroke widths then were either emphasized
(1:6) or unemphasized (1:8). Eight "types"
of digits, 10 of each (1 through 0), were used for a
total of 80. Some, such as the 1 and 0, were duplicates
of other types and were included to prevent
imbalance in the frequency of presentation of each
form and as an offset for guessing habits on the part
of the subjects. The types were (see Figure 1):


Type A: conventional arabic, even-overall stroke
width, SW/H 1:8

Type B: conventional arabic, even-overall stroke
width, SW/H 1:6

Type C: symbolic, even-overall stroke width,
SW/H 1:8

Type D: symbolic, even-overall stroke width,
SW/H 1:6

Type E: symbolic, emphasized horizontal components,
SW/H 1:6 and 1:8

Type F: symbolic, emphasized vertical components,
SW/H 1:6 and 1:8

Type G: symbolic, emphasized unique components,
SW/H 1:6 and 1:8

Type H: symbolic, special design with only unique
components, the common elements absent, SW/H 1:6


Training. Two weeks and one week prior to the 
data gathering the Ss were familiarized with both the 
conventional and symbolic digits. Specific training 
using the actual stimulus cards was given to all Ss
and the method used in the design of the digits was 
fully described. The level of learning achieved was 
tested by having all the Ss individually reproduce 
accurately all of the configurations used in the study. 
The symbolic digits have cognates in the conven- 
tional arabic numerals, facilitating positive transfer 
of training. 

Procedure. Ss were given individual trials, and a 
complete trial for an S consisted of one round 
through the entire set of stimulus cards. The cards 
were reassembled in random order for each S's trial. 
The initial stimulus to S distance was 21 feet, and 
this was decreased by one-foot intervals until the S 
could read the digit correctly. The stimulus cards 
were illuminated at 100 foot-candles with all of the 
test room conditions closely following optimum 
visual conditions. The Ss were given instructions not 
to give a response until they were reasonably con- 
fident or could see the digit well. Wrong responses 
were not scored, and the experimenter (E) gave no 
reply to errors; however, the instructions were re- 
peated. Only scattered guesses occurred and not more 
than twice by any one S so there was no evidence 
to show any S was using an elimination process to 
guess responses by observing E's behavior on wrong
responses. After a correct response, the S returned 
to the starting position and the next stimulus card 
in the series was presented. This routine continued 
the same for the entire set of stimuli. Data were re- 
corded in terms of distance in feet at which the S 
read the stimulus card correctly. 
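The descending-distance procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the `can_read` callback standing in for the subject, and the lower limit `min_ft` are hypothetical and do not come from the study; only the 21-foot start and the one-foot step are taken from the text.

```python
def recognition_distance(can_read, start_ft=21, step_ft=1, min_ft=1):
    """Return the greatest tested distance (feet) at which the digit is read.

    can_read is a hypothetical stand-in for the subject: it takes a distance
    in feet and returns True when the response at that distance is correct.
    min_ft is an assumed lower limit; the text does not state one.
    """
    distance = start_ft
    while distance >= min_ft:
        if can_read(distance):
            return distance      # score: distance of the first correct reading
        distance -= step_ft      # no correct response yet: move one step closer
    return None                  # never read correctly within the tested range


# Hypothetical subject who can read this card at 14 feet or nearer.
score = recognition_distance(lambda d: d <= 14)
print(score)
```

Because the distance only decreases, the score is the maximum distance at which the card was read, which is the form in which the data were recorded.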

Statistical Analysis. The effects of form, stroke
width, and the interaction were analyzed by factorial
analysis of variance (for just the complete symbolic
digits, Types C through G, omitting the conventional
numerals and the special incomplete symbolic designs).
The analyses of all types were done by 10
separate treatment X subjects analysis of variance
tests, one for each form, 1 through 0. In addition 10
Duncan's shortest significant range tests were done
to indicate which particular means were significantly
different from which others.

Definition of terms. Throughout this report the
following terms will be used in accordance with the
definitions given here:

Form: the shape or contour of the digit irrespective
of the stroke variations (e.g., the 2 has one form
and the 3 has a different form, etc.)

Stroke Width: the ratios of SW or line emphasis
(e.g., 1:6, bold stroke, emphasized; 1:8, unemphasized,
nonbold stroke)

Configuration: the total or whole digit including
form and SW variations (e.g., the entire stimulus
complex, the total stimulus)

Component: any part or stroke or contour of a
digit (e.g., the stimulus elements, the diagonal of the
7, the lower loop of the conventional 5, etc.)

Uniqueness: the condition of having components
or element combinations that are not present in other
configurations, either through changes in SW or form
(e.g., only the 7 has a top horizontal bar and a diagonal
stroke in combination, or combinations of
stroke width variations and arrangement of lines)


TABLE 1
RELATIVE EFFECTS OF FORM AND STROKE WIDTH

Factorial analysis of variance

Source           df      SS       MS       F
Form              9     990    110.00
Stroke Width      4      40     10.00
F X SW           36     373     10.36    2.89*
Subjects         10    3786    378.60
F X S            90     671      7.46
SW X S           40     142
F X SW X S      360   12901   (error term)

Note. Types C through G only.
* p < .01.


RESULTS 


Table 1 gives the factorial analysis of variance
and shows form and SW interacting significantly
(.01 level of confidence) for the
symbolic digits. The different forms did not
give significantly different visibilities when
stroke width was held constant. Form changes
and SW variations together accounted for differences
in perceptibility of the configurations.
In other words, increased perceptibility of
digits can be obtained by varying the SW
emphasis of certain forms. Contrary to prior
studies, the symbolic digits tended to be more
perceptible.
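As a check on the legible entries of Table 1, each mean square is the sum of squares divided by its degrees of freedom. The sketch below recomputes only the rows whose df and SS are clearly readable. The error mean square it derives is not read from the table; it is inferred from the reported interaction F of 2.89 and is labeled as such. The reading of the degrees of freedom as 10 forms, 5 types (C through G) and 11 subjects is likewise an inference from the df column.

```python
# (df, SS) pairs as legible in Table 1.
table_rows = {
    "Form": (9, 990),
    "Stroke Width": (4, 40),
    "F X SW": (36, 373),
    "Subjects": (10, 3786),
    "F X S": (90, 671),
}

# Mean square = SS / df, rounded to two places as in the table.
mean_squares = {name: round(ss / df, 2) for name, (df, ss) in table_rows.items()}

# Inferred, not tabled: the error mean square implied by F = MS(F X SW) / MS(error).
reported_f = 2.89
implied_error_ms = mean_squares["F X SW"] / reported_f

# The df column sums to one less than the number of observations in a
# 10 forms x 5 types x 11 subjects layout (the layout is an inference).
total_df = 9 + 4 + 36 + 10 + 90 + 40 + 360

for name, ms in mean_squares.items():
    print(f"{name:<14}{ms:>8.2f}")
print("implied error MS:", round(implied_error_ms, 3))
print("total df:", total_df)
```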


DISCUSSION 


The configuration of SW variations and
form that was more perceptible varied greatly.
No regular patterns appeared. Each
digit configuration is strictly unique.
The part of a digit requiring emphasis depends
on that digit.

A few other clues can be pointed out. The
factor “openness” suggested by other experiments
is the same for all symbolic types so its
validity as a perception factor is questionable.
The results demonstrate that varying SW
within a digit may increase visibility. In most
cases the symbolic configurations were more
perceptible than either of the conventional
types. Comparing some of the digits, the
variation and lack of consistency from form
to form with the best SW configurations is
demonstrated. The differences in configuration
in arabic and symbolic 7’s are small;
however, there were differences in the visibility.
This raises a question on the use of
anything other than straight lines for digit
construction. The differences in narrow overall
stroke and the wider overall stroke configurations
give evidence that for some digit
forms a narrower ratio is better and for others
a wider stroke gives greater visibility. This
would relate to the interaction effects demonstrated.
Simplicity of form shows up as an
asset to perception in several cases. For the
0, all the symbolic types were significantly
more visible than the two conventional arabic
0's. The only difference in the shape of the
conventional and symbolic 0’s is that the former
is circular and the latter is square. Compare
this with the curved versus straight 7 results
noted above.
Consider the problem of the interaction of
stroke width and the shape of the digit. The
data show that the same shaped digits had
different levels of visibility depending upon
the particular emphasis of SW made. The determination
of interaction effects of these two
variables shows they do significantly affect the
visibility of the symbolic digits. Whereas in
previous studies stroke width was varied the
same overall each figure, and overall each set
of 10 digits as well, we allowed for a variation
within each set, and within each numeral.

The results are at odds with previous studies.
Previously stroke width (being the same
overall each digit in other research designs)
was secondary in effect to form when
overall SW/H ratios were observed. The present
studies show that varying the stroke width
within each digit can increase visibility, and
form alone is not the determiner. The interaction
of form and stroke width has its
influence on perception. The
contributions of form, stroke width, and
interaction take no regular pattern.


The significant interaction of form and SW 
makes an interpretation of the effects of 
either difficult. Differences in form must have
some influence on perceptibility, however. The
interaction of form and SW presumes some
significance of form. Some types of a particular
form were significantly more visible than
other types. An inspection of the results seems
to show that the stroke width effects have 
more influence on perception than the changes 
in form. This interaction brings many prior 
studies into a different light, since previously 
only one or the other (form or SW) was 
varied, not both. Established optimal SW/H 
ratios are possibly in need of a recheck. 

Other studies typically determined the best 
stroke width to height ratios for an entire set 
of 10 digits, but in the present study these 
were varied from digit to digit. The results 
indicate, however, that some digits were in- 
creased in visibility by varying the boldness 
of the stroke width within individual digits, 
representing a major departure from past 
methodological and theoretical approaches to 
the study of numeral visibility. There is seem- 
ingly no reason, other than an esthetic one, to 
maintain a constant stroke width for any one 
digit or any set of digits. The results of the 
study show that within-variable stroke width
does increase numeral visibility. 


SUMMARY AND CONCLUSIONS 


As the complexity of visual displays and 
the demands of the machine upon the man in- 
crease, there is a real danger that the require- 
ments will surpass the abilities of the human 
operators. One area of research in human fac- 
tors concerns methods of presenting informa- 
tion to the operator in the most perceptible 
form. One small but important aspect in this 
problem is a design of digits to be used in 
the visual displays. 

The results of previous research on digit de- 
sign were summarized as follows: The opti- 
mum stroke width to height ratios have been 
apparently determined, angularity of digit 
shape increases visibility, and the openness of 
digits aids the discrimination.

Certain questions about digit design were 
approached in this study: What are the rela- 
tive visibilities of conventional and symbolic 
digits? Do form and stroke width interact? 
And, most importantly, will varying the stroke
width within a set of digits and also within
any particular digit increase the perceptibility?

The problem was to determine the effects 
of the above factors through a systematic 
variation of the stroke within a set of sym- 
bolic digits (straight-line, angular forms con- 
structed from an eight-element matrix having 
cognates in the conventional digits). 

The S-to-stimulus distance was varied by 
decreasing intervals until the S could cor- 
rectly identify the digit. Eight types of digit 
configurations plus conventional arabic nu- 
merals were used. The 80 digits were pre- 
sented, one at a time, in random order to the 
Ss. Data were recorded in terms of the maxi- 
mum distance at which the S could correctly 
identify the digit. 

The results indicated that the symbolic nu- 
merals were significantly more visible than the 
conventional digits, contrary to prior reports. 
Form and stroke width showed a significant 
interaction, supporting the conclusion that 
stroke width could vary within a set of digits 
or even within a single digit so as to increase 
the discriminability. 

We can generalize to say that each digit 
must be designed on an individual configura- 
tion basis taking form, stroke widths, and 
their combinations into account. Two factors 
must be considered: (a) absolute recognition 
of the digit as such, (b) differentiation (by
relative emphasis) of features that suggest
similarity (hence ambiguity) of one digit
to each of the others.

Questions still unanswered are: what are
the effects of decreasing illumination to near
thresholds, tachistoscopic presentation, and
the relative perceptibility of groups of the
symbolic digits?


REFERENCES 


ALLUISI, E. A., & MARTIN, H. B. An informational
analysis of verbal and motor responses to symbolic
and conventional arabic numerals. J. appl.
Psychol., 1958, 42, 79-84.

ALLUISI, E. A., & MULLER, P. F. Rate of information
transfer with seven symbolic visual codes: Motor
and verbal responses. USAF WADC tech. Rep.,
1956, No. 56-226.

BERGER, C. Stroke-width, form and horizontal spacing
of numerals as determinants of the threshold
of recognition. Part I. J. appl. Psychol., 1944, 28,
208-231. (a)

BERGER, C. Stroke-width, form and horizontal spacing
of numerals as determinants of the threshold
of recognition. Part II. J. appl. Psychol., 1944, 28,
336-346. (b)

BROWN, F. R. A study of the requirements for letters,
numbers, and markings to be used on trans-illuminated
aircraft panels: IV. Legibility of uniform
stroke capital letters as determined by size
and height to width ratio and as compared to
garamond bold. USN Air Material Cent. Rep.,
1953, No. TED NAM EL-609, Part 4.

BROWN, F. R., & LOWERY, E. A. A study of the requirements
for letters, numbers and markings to
be used on trans-illuminated aircraft control panels:
I. The effect of stroke width upon the legibility
of capital letters. USN Air Material Cent. Rep.,
1949, No. TED NAM EL-609, Part 1.

BROWN, F. R., LOWERY, E. A., & WILLIS, M. P. A
study of the requirements for letters, numbers, and
markings to be used on transilluminated aircraft
control panels: III. The effect of stroke width and
form upon the legibility of numerals. USN Air
Material Cent. Rep., 1951, No. TED NAM EL-607,
Part 3.

CHAPANIS, A., GARNER, W. R., & MORGAN, C. T.
Applied experimental psychology. New York:
Wiley, 1949.

COHEN, J., & WEBB, I. B. An experiment on coding
of numerals for tape presentation. USAF WADC
tech. Rep., 1953, No. 54-86.

CRAIK, K. J. W. Instrument lighting for night use.
Air Ministry Flying Personnel Res. Committee
Rep., Lond., 1941, No. 342.

FORBES, T. W., & HOLMES, R. S. Legibility distances
of highway destination signs in relation to letter
height, letter width, and reflectorization. Proc.
Highway Res. Bd., 1939, 19, 321-335.

KUNTZ, J. E., & SLEIGHT, R. B. Legibility of numerals:
The optimal ratio of height to width of stroke.
Amer. J. Psychol., 1950, 63, 567-575.

LANSDELL, H. Effects of form on the legibility of
numbers. Canad. J. Psychol., 1954, 8, 77-79.

LOUCKS, R. B. Legibility of aircraft instrument
dials: A further investigation of the relative legibility
of tachometer dials. USAF Sch. ..., 1944.

MACKWORTH, N. H. Legibility of raid block letters
and numbers. Air Ministry Flying Personnel Res.
Committee Rep., Cambridge U., 1944, No. ...

MCLAUGHLIN, S. C. Configuration and stroke width
in numeral legibility. Unpublished master's thesis,
Tufts College, 1948.

SHAPIRO, H. B. Factors affecting the legibility of
digits. Amer. Psychologist, 1951, 6, 364. (Abstract)

SOAR, R. S. Height-width proportion and stroke
width in numeral visibility. J. appl. Psychol., 1955,
39, 43-46. (a)

SOAR, R. S. Stroke width, illumination level, and figure-ground
contrast in numeral visibility. J. appl.
Psychol., 1955, 39, 429-432. (b)

SOAR, R. S. Numeral form as a variable in numeral
visibility. J. appl. Psychol., 1958, 42, 158-162.


(Early publication received June 16, 1961)


Journal of Applied Psychology 
1961, Vol. 45, No. 6, 369-...


THE INTERACTION OF INFORMATION DISPLAYS
WITH CONTROL SYSTEM DYNAMICS IN
CONTINUOUS TRACKING¹


R. W. OBERMAYER, W. F. SWARTZ, AND F. A. MUCKLER


Martin Company


Perhaps the most widely used display modes
in human tracking studies have been following
and compensatory information displays.
In the compensatory display mode, the operator
attempts to align a moving index with
a fixed command index. In this case, the moving
index is the sum of the operator's control
movements, control system dynamics, and
some arbitrary forcing function. In the following
display technique, on the other hand, the
forcing function is programed directly to the
command index providing a moving profile of
desired performance. The second moving index
then becomes directly the resultant of the
operator's control output. In both cases, of
course, the objective is to null the error between
the two display indices. Experimental
comparisons of following and compensatory
displays have been predominantly in favor
of the following display (e.g., Poulton, 1950;
Senders & Cruzen, 1952), presumably due to
the additional information available to the
operator.

However, since stimulus and response are
clearly interdependent in a man-machine
feedback control system, it is not difficult
to find instances where the effects on tracking
performance of type of information display
and control system dynamics interact
(e.g., Chernikoff & Taylor, 1957). Theoretically,
it is possible that under certain
dynamic conditions the advantages of the following
display might disappear. If a control
system includes integrators, the system output
may continuously change without a corresponding
continuous change of the operator’s
control force. Relieving the operator of
the necessity of applying control force continuously,
or in some time-sequenced pattern,
has been termed "unburdening" by Birmingham
and Taylor (1954). Based on the assumption
that the human operator will perform
best when his function is simplest,
unburdening may result in a reduction of
complexity in both display and control requirements.
The compensatory display could
be superior if the information contained in
the instantaneous amplitude of the displayed
error is sufficient to specify the response
which reduces the error to zero. As Chernikoff
and Taylor (1957) state: “The S would
not have to take account of target velocities
or error velocities nor would he have to predict
in any fashion." Their data clearly suggest
that the frequency of the course forcing
function and variations of control system dynamics
markedly affect the efficiency of compensatory
and following tracking.

To provide additional clarification, this
study was undertaken to investigate the interaction
between modes of information display
and control system dynamics with a low frequency
simple sine wave. Both compensatory
and pursuit modes of information display are
compared where the operator is provided with
direct control of successively higher derivatives
of the control system output.

¹ ... by the United States Air Force ... 33(616)-...
Laboratory, ... Wright-Patterson Air Force Base,
Ohio, as a part of the Air Force Control-Display
Integration Program.


METHOD 
Apparatus 


Display. The display elements were a horizontal 
line and an inverted T presented on a 5-inch Dumont 
Dual Beam CRT. The horizontal line was the com- 
mand index and was 2 inches long. The inverted T 
was the subject-controlled index. It consisted of a 
horizontal line 1 inch long with a vertical line, .5 
inch long, bisecting the horizontal component. Both 
symbols were centered on the CRT and moved only 
in the vertical dimension.

When using the following display mode, the inverted
T was controlled by the subject and the horizontal
line was positioned by a course generator. The
compensatory mode consisted of the horizontal line
being used as a reference on the center of the screen,
while the inverted T was driven by the difference
between the course generator output and the control
system output.
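The two display modes can be summarized in a short sketch of where each index sits for a given course value and control system output. The function names are hypothetical, and the sign convention for the compensatory error (course minus output) is an assumption; the text states only that the inverted T was driven by the difference of the two.

```python
def display_indices(course_in, output_in, mode):
    """Return (command_index, controlled_index) vertical positions in inches.

    following:    the horizontal line shows the course, the inverted T shows
                  the control system output.
    compensatory: the horizontal line stays at screen center (0.0), the
                  inverted T shows the difference (assumed: course - output).
    """
    if mode == "following":
        return course_in, output_in
    if mode == "compensatory":
        return 0.0, course_in - output_in
    raise ValueError(f"unknown display mode: {mode}")


def display_error(command_index, controlled_index):
    """Magnitude of the separation between the two indices, in inches."""
    return abs(command_index - controlled_index)


for mode in ("following", "compensatory"):
    indices = display_indices(1.5, 0.5, mode)
    print(mode, indices, display_error(*indices))
```

The separation between the indices is the same in both modes; what the following display adds is a separate view of the course itself.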

Control. A center-positioned joy stick was used. It
was 1 inch in diameter and 22 inches long. It was 
possible to move the top of the control stick in an 
arc of plus or minus 6 inches from the stick’s null 
position by forward or rearward movements, respectively.
The control stick was spring loaded so
that it would return to within 1/4 of an inch of its
null position. A breakout force of .5 pound was required
to displace the stick from its null position.
The force-displacement gradient was linear and was
1 pound per inch. An analog deadband circuit was
incorporated into the experimental apparatus to correspond
to plus or minus 1/4 of an inch of stick displacement.

A displacement of the control stick resulted in a
proportional position, rate, or acceleration of the controlled
index, i.e., control with no dynamics, one integrator,
and two integrators, respectively. Two inches
of stick movement resulted in 1 inch, 2 inches per
second, or 8 inches per second squared of controlled
index displacement for position, rate, and acceleration,
respectively. All control system dynamics resulted
in the same controlled index displacement at
.5 second for a step function displacement of the control
stick at the control gains chosen. A forward displacement
of the control stick forced the controlled
index downward and a rearward displacement forced
the controlled index upward.
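The statement that all three dynamics give the same index displacement at .5 second can be verified with a short worked sketch. It assumes ideal integrators and ignores the deadband, the breakout force, and the direction of movement. The per-inch gains are derived from the values quoted for a 2-inch stick movement, and the function name is hypothetical.

```python
# Gains per inch of stick deflection, derived from the quoted values for a
# 2-inch deflection: 1 in., 2 in./sec., and 8 in./sec.^2.
GAIN_PER_INCH = {"position": 0.5, "rate": 1.0, "acceleration": 4.0}


def index_displacement(dynamics, stick_in, t_sec):
    """Index displacement (inches) t_sec after a step deflection of the stick."""
    if dynamics not in GAIN_PER_INCH:
        raise ValueError(f"unknown dynamics: {dynamics}")
    k = GAIN_PER_INCH[dynamics] * stick_in
    if dynamics == "position":
        return k                      # no integrator: displacement follows the stick
    if dynamics == "rate":
        return k * t_sec              # one integrator: constant velocity
    return 0.5 * k * t_sec ** 2       # two integrators: constant acceleration


responses = {d: index_displacement(d, 2.0, 0.5) for d in GAIN_PER_INCH}
print(responses)
```

For a 2-inch step, each of the three cases yields 1 inch of index displacement at .5 second, so the gains are matched at that instant and diverge before and after it.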

Course. A sine wave course, acting as a forcing 
function, was used in the experiment. The course had 
a frequency of 2-2/9 cpm, maximum amplitude of 2 
inches, and a period of 27 seconds. The course was 
replicated with fixed amplitude, frequency, and di- 
rection from zero for each experimental trial. 
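The stated frequency and period of the course are consistent with each other, as the following sketch shows. It treats the 2-inch maximum amplitude as the peak excursion from zero and starts the sine wave at zero phase; both are assumptions, and the function name is hypothetical.

```python
import math
from fractions import Fraction

PERIOD_SEC = 27
AMPLITUDE_IN = 2.0   # assumed to be the peak excursion from zero

# Cycles per minute implied by a 27-second period.
frequency_cpm = Fraction(60, PERIOD_SEC)


def course_position(t_sec):
    """Course (forcing function) position in inches at time t_sec."""
    return AMPLITUDE_IN * math.sin(2 * math.pi * t_sec / PERIOD_SEC)


print(frequency_cpm)                      # exact fraction of cycles per minute
print(course_position(PERIOD_SEC / 4))    # position at the first quarter period
```

Sixty seconds divided by a 27-second period gives 20/9, that is, 2-2/9 cycles per minute, which matches the quoted frequency.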

Scoring. Measures of six dependent variables were 
obtained in the experimental situation: 

1. The Average Absolute Error (AAE), or inte- 
grated error, was the average difference in inches be- 
tween the command and controlled indices, without 
consideration of the sign of the differences, for each 
scoring period. 

2. The Root Mean Square Error (RMS) was the 
Square root of the integrated squared display error, 
in inches, which was obtained for each scoring period. 

3. The Time-on-Target (TOT) was the time, in 
seconds, that the controlled index was within a 
tolerance band of plus or minus 1/10 of an inch of 
the command index, 

4. Hits was the number of times that the error 
between the command index and the controlled index 
became less than plus or minus 1/10 of an inch. 

5. Average Absolute Control Movement (AACM)
described the average distance in inches along the
arc of the control stick handle that the control stick
was moved for each scoring period.

6. The Control-Stick Count estimated the number
of times that the control stick was moved through
the stick deadband of plus or minus 1/4 of an inch.
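The four error-based measures can be illustrated on a sampled error record. This is a sketch under stated assumptions, not the analog scoring circuitry used in the experiment: the error is taken as a list of samples at a fixed interval, RMS is computed as the root of the mean squared error, and a hit is counted each time the error enters the tolerance band from outside. The function name and the sample values are hypothetical.

```python
import math


def score_trial(errors_in, dt_sec, tolerance_in=0.1):
    """Return AAE, RMS, TOT, and Hits for one scoring period.

    errors_in: sampled display error in inches (command minus controlled index).
    dt_sec:    sampling interval in seconds (an assumption of this sketch).
    """
    n = len(errors_in)
    aae = sum(abs(e) for e in errors_in) / n
    rms = math.sqrt(sum(e * e for e in errors_in) / n)
    on_target = [abs(e) < tolerance_in for e in errors_in]
    tot = sum(on_target) * dt_sec
    # A hit is an entry into the tolerance band: on target now, off target before.
    hits = sum(1 for prev, cur in zip([False] + on_target, on_target)
               if cur and not prev)
    return {"AAE": aae, "RMS": rms, "TOT": tot, "Hits": hits}


# Hypothetical error samples, one per second.
scores = score_trial([0.5, 0.05, 0.05, 0.3, 0.0, 0.4], dt_sec=1.0)
print(scores)
```

TOT and Hits separate two aspects of performance that AAE and RMS combine: how long the index stayed within 1/10 inch, and how often it had to be brought back there.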


Subjects 


Nine adult males served as subjects in the experiment.
They were selected from a broad population
of professional pilots and engineers. In addition, the
subjects were college graduates and varied between
25 and 40 years of age.

The treatment X treatment X subjects model of
the analysis of variance was used (Lindquist, 1956,
p. 237). There were six treatment combinations as
a result of two levels of display modes and three
levels of control dynamics. This design was used to
increase the precision of the experiment by partitioning
the variance due to intersubject differences. An
interesting by-product of this design was the possible
examination of the interaction of subject performance
levels with each of the treatments.


Procedure 


Each subject was placed in a completely enclosed 
mockup of a fighter-type cockpit facing the the 
display which was approximately 28 inches from " 
subject's eyes, A low intensity white light was plac 
inside the mockup to provide ambient illuminato" 
The brightness of the light, fixed throughout the € 
periment, was sufficient to illuminate the pancl. 

An instruction and practice period was held prior
to administration of the experimental treatments for
each subject. The subjects were thoroughly instructed
as to the nature of following and compensatory
modes of information display and position, rate, and
acceleration control system dynamics. The practice
consisted of presenting each treatment combination
of modes and dynamics to the subject for two minutes.
The sine wave course was used.

The experiment consisted of randomly presenting
each of the six treatment combinations of modes and
dynamics to each of the nine subjects. There were
5-minute rest periods between each treatment
combination. Ten trials of each treatment combination
were administered to each subject. Each trial was 1
minute in length with a 30-second rest period between
trials. A yellow light on the panel was automatically
lighted to warn the subject that 10 seconds
remained before the beginning of the next trial. During
the 30-second rest period, trial scores for
the dependent variables were obtained at the
experimenter's console. The overall time for the subject to
complete the task was approximately 2 hours.
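
As a rough check on the schedule described above, the following sketch draws a random treatment order for one subject and adds up the scheduled time. It is only an illustration: the 1-minute trial length is an assumption inferred from the reported total of about 2 hours, practice time is excluded, and all names are hypothetical.

```python
import random

MODES = ("F", "C")            # following, compensatory display
DYNAMICS = ("P", "R", "A")    # position, rate, acceleration control
TRIALS_PER_COMBINATION = 10
TRIAL_MIN = 1.0               # assumed trial length in minutes
REST_BETWEEN_TRIALS_MIN = 0.5
REST_BETWEEN_COMBINATIONS_MIN = 5.0


def treatment_order(rng):
    """Return the six mode-dynamics combinations in random order for one subject."""
    combos = [m + d for m in MODES for d in DYNAMICS]
    rng.shuffle(combos)
    return combos


def session_minutes(n_combinations=6):
    """Total scheduled time for the experimental trials, excluding practice."""
    per_combination = (TRIALS_PER_COMBINATION * TRIAL_MIN
                       + (TRIALS_PER_COMBINATION - 1) * REST_BETWEEN_TRIALS_MIN)
    rests = (n_combinations - 1) * REST_BETWEEN_COMBINATIONS_MIN
    return n_combinations * per_combination + rests
```

Under these assumptions the trials and rests come to 112 minutes, which agrees with the stated overall time of approximately 2 hours.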


RESULTS 
Performance Scores

A total of 540 performance scores were obtained
for each measure. Ten trials in each
treatment combination were presented to the
subject; however, only the last five trials of
each combination were retained for examination
in order that stable data might be obtained
from each subject. Retaining only half
the total observations resulted in each measure
being based upon 270 scores.

Figure 1 presents the mean tracking per- 
formance data for the AAE and RMS meas- 
ures; each mean was based upon 45 trials. 
The treatments are ranked along the abscissa
of the figure in ascending order of treatment 
mean error magnitude from left to right. In 
Figure 1, a decrement of performance is noted 
for each change of control system dynamics 
to a higher derivative. Further, performance 
with following mode ranked better than com- 
pensatory mode for position control in both 
of the measures. However, this relation be- 
tween following and compensatory was re- 
versed for the rate and acceleration dynamics. 
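
The score counts quoted in this section follow directly from the design. A minimal sketch of the arithmetic, with hypothetical names:

```python
SUBJECTS = 9
COMBINATIONS = 6        # 2 display modes x 3 control dynamics
TRIALS = 10             # trials given per combination
RETAINED = 5            # last five trials kept for analysis


def score_counts():
    """Return (scores obtained, scores retained, trials per treatment mean)."""
    obtained = SUBJECTS * COMBINATIONS * TRIALS
    retained = SUBJECTS * COMBINATIONS * RETAINED
    per_treatment_mean = SUBJECTS * RETAINED
    return obtained, retained, per_treatment_mean
```

Each treatment mean in the figures therefore pools the five retained trials of all nine subjects.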

Figure 2 presents the data for the TOT and
Hits measures. TOT decreased in value as the
AAE and RMS increased in value. An examination
of the Hits measure reveals that there
were fewer tolerance band crossings for the
rate control than for either the position or
acceleration control. It becomes apparent, then,
upon examining the TOT values, that the subjects
tended to be on target for longer periods
per time for the rate control than for position
control even though better TOT performance
was obtained for the position control. When
compared to rate control, a relatively higher
number of hits for the acceleration control
did not result in better TOT performance,
however.

Fig. 1. Mean tracking performance data of each
treatment for the AAE and RMS measures. (F = following
display, C = compensatory display, P = position
control, R = rate control, A = acceleration control.)

Fig. 2. Mean tracking performance data of each
treatment for the TOT and Hits measures.

The Control-Stick Count measured the number
of times that the control stick was moved
through the null position; the results are
shown in Figure 3. As the derivative of control
is increased, Control-Stick Count decreases
slightly, then increases considerably
with acceleration control dynamics. It would
appear from these data that the subjects'
response was not in the form of discrete positioning
movements, for if this were the case
Control-Stick Count would increase as the
derivative of control dynamics was increased.
Also, while it can be computed that the subjects
could achieve zero error by employing
a sinusoidal response, this too was not the
case since otherwise no differences in
Control-Stick Count would result. While no specific
measure of the form of the subject's
response was taken, it was observed that
with acceleration dynamics the response of
many subjects took the form of a high frequency,
low amplitude signal (frequently
termed "dither") superimposed on a low frequency,
high amplitude signal.

The AACM, as shown in Figure 3, decreased
as a direct function of increasing
derivative of control system dynamics.
Considerably less control stick movement resulted
with rate and acceleration dynamics than with
position control dynamics. There was very
little difference in AACM for the two modes
of information display.


Fig. 3. Mean tracking performance data of each
treatment for the AACM and Control-Stick Count
measures.


Statistical Analysis 


Analysis of Variance Results. The measures
of AAE, RMS, and TOT were statistically examined
by means of the treatment × treatment
× subjects model of the analysis of
variance (Lindquist, 1953). Prior to analysis,
the data for each of the three measures had
to be transformed to satisfy the assumptions
of the analysis of variance. The appropriate
transform for each measure was determined
on the basis of the normality plot for the
respective distribution. It was determined that
the AAE and RMS measures should be transformed
logarithmically (log AAE and log
RMS) and that the TOT measure should receive
a tangent transformation, tan 1.5 (TOT - 10).
Upon completion of the transformations,
homogeneity of variance for each measure was
examined (Odeh & Olds, 1959). For each mean
used in the analysis of variance there is
associated an s². Using the procedure established
by Odeh and Olds, an identical analysis of
variance was conducted on the quantity log
s². By using this method, the distribution of
each source of variance was analyzed separately
with respect to heterogeneity. The results
of this analysis are presented in Table 1
in conjunction with the summary of the analysis
of variance results.

TABLE 1
SUMMARY OF THE ANALYSIS OF VARIANCE RESULTS FOR
THE MEASURES OF AAE, RMS, AND TOT

              Significance level by measure
Source        AAE      RMS      TOT
Modes         5%*      ns       ns
Dynamics      1%       1%       1%**
Subjects      1%       1%       1%*
M × D         1%       5%       1%
M × S         ns       ns       5%
D × S         1%       1%       1%

* Heterogeneous at 5% level as determined by the method
of Odeh and Olds (1959).
** Heterogeneous at 1% level as determined by the method
of Odeh and Olds (1959).

TABLE 2
RESULTS OF THE MULTIPLE RANGE TEST FOR THE M
× D INTERACTION OF THE AAE, RMS, AND TOT
MEASURES

Measure    Range comparisons
AAE        FP   CP   CR   FR   CA   FA
RMS        FP   CP   CR   FR   CA   FA
TOT        FP   CP   CR   FR   CA   FA

Note. The underline denotes no significant difference at the
5% level.
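
The preparation of one subject-by-treatment cell for the two analyses described above might look as follows. This is a hedged sketch only: the base of the logarithm is not stated in the text (base 10 is assumed), the sample variance with n - 1 in the denominator is assumed for s², and `cell_summary` is a hypothetical helper, not part of the Odeh and Olds procedure itself.

```python
import math
from statistics import mean, variance


def cell_summary(scores):
    """Return (mean of log scores, log of their sample variance) for one cell.

    The first value would enter the analysis of variance of the transformed
    measure; the second would enter the parallel analysis on log s^2.
    """
    logs = [math.log10(x) for x in scores]  # logarithmic transform (base assumed)
    return mean(logs), math.log10(variance(logs))
```

For example, `cell_summary([0.10, 0.20, 0.40, 0.80, 1.60])` summarizes five retained trials of an error measure such as AAE or RMS.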
In Table 1, the most interesting results are
displayed in the Modes × Dynamics interaction.
It is noted that the significant interaction,
whose distribution was homogeneous
at the 5% level, has been obtained in each
of the three measures. A significant Dynamics ×
Subjects interaction was found, also, for all
measures, but the Modes × Subjects interaction
was significant for only the TOT measure.
Unfortunately, it cannot be determined
whether the relation between display modes
and control dynamics was also dependent
upon subject characteristics since the M × D
× S interaction was nontestable.
Multiple Group Comparisons. Table 2 presents
the results of Duncan's new multiple
range test (Duncan, 1955) of the M × D interaction
in summary form for each of the
measures. All multiple comparisons of the
treatment means were conducted at the 5%
level. An underline denotes no significant difference
between those treatment means at the
5% level; all other comparisons are significant.
In addition, the treatments are ranked
by the magnitude of their means, in ascending
order of error from left to right. It is
noted that there was no difference in the statistical
outcomes in the RMS and TOT measures
and very little between the AAE and
RMS-TOT measures. The treatment ranking
was the same for all three measures.
Examining the following and compensatory 
modes for common conditions of position, 
rate, and acceleration dynamics in Table 2 
gives statistical bearing to the data contained 
in Figure 1. Compensatory was nonsignifi- 
cantly different from following mode for po- 
sition and rate control, but was significantly 
superior to following for acceleration control 
in each of the three measures.
Examining position, rate, and acceleration 
dynamics for common conditions of follow- 
ing and compensatory modes discloses that 
position, rate, and acceleration were each sig- 
nificantly different from one another for the 
following mode with position control being 
the best and acceleration the worst; however, 
only the acceleration control was significantly 
different from the position control for the 
compensatory mode. It is noted that these re- 
sults, too, were the same for all three measures. 
Individual Subject Comparisons. Table 3
shows the relationship of position, rate, and
acceleration dynamics for each subject within
the significant D × S interaction. Mean performance
scores, when averaged over all subjects,
indicated that a decrement in performance
occurred with changes from position to
rate to acceleration control for each measure.
This is demonstrated in Figure 1. However,
Table 3 demonstrates that performance for
each individual subject was not affected, in
all cases, in the manner which is indicated by
averaging over all subjects. The same results
were not obtained for each of the measures.

TABLE 3
COMPARISON OF POSITION, RATE, AND ACCELERATION
DYNAMICS FOR EACH SUBJECT

           Range comparison by measure
Subject    AAE      RMS      TOT
1          RPA      PRA      PRA
2          RPA      PRA      RPA
3          PRA      PRA      PRA
4          PAR      PAR      PAR
5          PRA      RPA      RPA
6          RPA      …        …
7          RPA      RPA      RPA
8          PRA      PRA      …
9          PRA      PRA      PRA

Note. Underline denotes no significant difference at the
5% level.


Discussion 


It is interesting in light of the abundance 
of research proclaiming the superiority of the 
following mode of information display by a 
factor of 2 to 1 or more, that in no condition 
of this experiment was the following display 
found to be superior to the compensatory dis- 
play. Both minimum and maximum tracking 
error in this study occurred with the follow- 
ing display. This is taken as further evidence 
that with low frequency courses the subject 
needs little more than simple error informa- 
tion provided by the compensatory display. 
Indeed, it is perhaps surprising that the re- 
sults of this study were not more favorable to 
the compensatory display, since it is suggested 
by Birmingham and Taylor (1954) that in 
cases where a simple stimulus-response rela- 
tionship exists, the additional information 
provided by the following display may con- 
tribute to degraded system performance. 

Using a complex course at about the same 
frequency as the simple sine wave in the 
current study (2-2/3 + 4-4/9 + 6-2/3 cpm), 
Chernikoff and Taylor (1957) found rate 
control to be superior to position control with 
both modes of information display. In the 
current study no reliable difference between 
position and rate dynamics was found with 
the compensatory display, and rate control 
was significantly inferior with the following 
display. One would expect with a regular and 
highly predictable sine wave course that the 
benefits of unburdening would be more promi- 
nent than with a complex course. A possibly 
critical difference between the two studies was 
that the average course velocity in the cur- 
rent study was higher (Bowen & Chernikoff, 
1958). It would appear that there is reason 
to suspect that course variables exist, in addition
to course frequency, which interact
with modes of information display and control
system dynamics.

It can be computed that if the output of 
each subject were sinusoidal and of the same 
frequency as the course, zero tracking error 
could result for the course used in this study. 
Unfortunately, there were no data collected 
which would directly specify the form of the 
subjects' response; however, it can be inferred
that the subject's response was not totally in 
the form of simple corrective movements and 
since tracking error was quite small, it may 
be assumed that a sinusoidal response was 
closely approximated. While the form of the 
desired subject response is sinusoidal for each 
order of derivative control, the phase relation- 
ship between the desired control movement 
and the sinusoidal course movement differs
with control system dynamics. For accelera- 
tion control dynamics, the desired subject’s 
control movements are of the same sinusoidal 
form but in exactly the opposite direction 
from that which would achieve zero error with 
position control dynamics (Ely, Bowen, & 
Orlansky, 1957, p. 15). The “natural” con- 
trol sensing was used for position control dy- 
namics; therefore, the control sensing for ac- 
celeration control dynamics may be termed 
the “unnatural” sensing. Green, Norris, and 
Spragg (1955) suggest that, due to the fact that
in compensatory tracking the resulting move- 
ment of the target is the result both of the 
subject’s input and the course generating sys- 
tem, the subject does not get a clear feed- 
back of the results of his own inputs, and 
therefore, these responses may be less subject 
to influence by changing the display-control 
relationships. The following display does pro- 
vide separate display of the results of the 
subject’s responses; presumably, the subject 
would be well informed of the effect of his re- 
sponses, and, in addition, could compare his 
stick motion to the motion of the course 
symbol. It is, therefore, possible that the su- 
periority of the compensatory display with ac- 
celeration control dynamics may be in part 
due to the existence of an unnatural control- 
display relationship with these dynamics. 
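
The phase argument in this paragraph can be made concrete with a short sketch. It assumes idealized dynamics in which the stick deflection sets the display position, its first derivative, or its second derivative through a constant gain; the course amplitude and frequency used in any call are illustrative values, not those of the experiment, and `required_stick` is a hypothetical name.

```python
import math


def required_stick(t, order, gain, amplitude, omega):
    """Stick deflection giving zero error for the course amplitude*sin(omega*t).

    order 0 = position control, 1 = rate control, 2 = acceleration control.
    """
    if order == 0:
        # display = gain * stick
        return amplitude * math.sin(omega * t) / gain
    if order == 1:
        # d(display)/dt = gain * stick
        return amplitude * omega * math.cos(omega * t) / gain
    if order == 2:
        # d2(display)/dt2 = gain * stick
        return -amplitude * omega ** 2 * math.sin(omega * t) / gain
    raise ValueError("order must be 0, 1, or 2")
```

At a course peak the position-control stick is deflected in the direction of the course, the rate-control stick is at null, and the acceleration-control stick is deflected in the opposite direction, which is the sign reversal the text calls the "unnatural" sensing.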

Since there are few identifying character- 
istics of the subjects of this experiment, it is 
futile to attempt to correlate the different
trends in performance found among individuals
to specific subject characteristics.
However, that the results of tracking studies
may be dependent on the particular individuals
employed is not a new finding. Walston
and Warren (1954) in attempting to measure
a mathematical transfer function for each
subject found that quite different tracking
techniques were used by some individuals.
Muckler (1960), for example, in investigating
tracking performance with oscillatory
transients, showed that trends in the performance
data depended on the individual
subjects and that even reversals in the overall
trend may result. He concluded that the
performance scores obtained were quite dependent
on the control techniques used by the
individual subjects and that the task techniques
must be considered in the interpretation
of the data. In the present study, three
different rankings of dynamics were found
among nine subjects, constituting six different
statistical outcomes. It is clear, then, that the
data of this study might have been changed
considerably by judicious sampling of the
nine subjects.

A control system gain was arbitrarily chosen
for each condition of control system dynamics
used in this study. While 2 inches of control
movement produced display changes of 1 inch,
2 inches per second, or 8 inches per second
squared for position, rate, and acceleration
dynamics, respectively, a wide range of
parameters might have been equally acceptable
for the purposes of this study. The conditions
of dynamics used, then, are best considered
as control gain-dynamics combinations.
There is no guarantee that the control
gain-dynamics used were optimum;
the optimum is likely a function of display
mode. It is entirely possible, therefore, that
the results of this experiment are not representative
of all control gain-dynamics combinations.
It is suspected that in part the
considerable reduction in AACM scores represents
an attempt by the subjects to achieve
system stability by reducing his overall gain;
however, since the amount of control movement
is also dependent upon the control gain
associated with each control dynamic, little of
a concrete nature can be stated here. The
unavoidable conclusion is that only additional
research can resolve the influence of control
gain in this study.
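
To illustrate what the three gain-dynamics combinations imply, the sketch below integrates the display response to a stream of stick samples. It is a simplified model under stated assumptions: the gains are taken as display output per inch of stick (0.5 inch, 1 inch per second, and 4 inches per second squared, from the 2-inch figures in the text), the dynamics are pure gain, single integration, and double integration, and a simple Euler step is used. `GAINS` and `simulate` are hypothetical names.

```python
GAINS = {"position": 0.5, "rate": 1.0, "acceleration": 4.0}


def simulate(dynamics, stick, dt):
    """Return the display position (inches) after applying the stick samples."""
    k = GAINS[dynamics]
    x = 0.0   # display position
    v = 0.0   # display velocity, used only for acceleration control
    for u in stick:
        if dynamics == "position":
            x = k * u                 # stick sets display position directly
        elif dynamics == "rate":
            x += k * u * dt           # stick sets display velocity
        elif dynamics == "acceleration":
            v += k * u * dt           # stick sets display acceleration
            x += v * dt
        else:
            raise ValueError("unknown dynamics: " + dynamics)
    return x
```

Holding the stick at 2 inches for 1 second moves the display 1 inch under position control, about 2 inches under rate control, and about 4 inches under acceleration control, which shows why the same stick travel is not comparable across dynamics.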


SUMMARY 


Nine skilled subjects performed single di- 
mension tracking with all combinations of 
following and compensatory modes of infor- 
mation display and position, rate, and ac- 
celeration control system dynamics. The forc- 
ing function was a slow frequency, simple sine 
wave course. Tracking performance was found 
to be dependent upon the interaction between 
display modes and control system dynamics. 
Compensatory tracking was equivalent to fol- 
lowing tracking for position and rate control, 
but was significantly superior to following 
tracking for acceleration control. For the compensatory
mode, position and rate control
tracking performance was statistically equivalent;
however, both position and rate control
were significantly superior to acceleration control.
For the following mode, all dynamics
conditions were significantly different with position
control demonstrating highest and acceleration
control lowest performance levels.
Examination of significant subject interactions
with treatments showed deviation in individual
cases from the averaged group data.


REFERENCES 


Birmingham, H. P., & Taylor, F. V. A human engineering
approach to the design of man-operated
continuous control systems. USN Res. Lab. Rep.,
1954, No. 4333.

Bowen, J. H., & Chernikoff, R. The effects of
magnification and average course velocity on compensatory
tracking. USN Res. Lab. Rep., 1958, No.
5186.

Chernikoff, R., & Taylor, F. V. Effects of course
frequency and aided time constant on pursuit and
compensatory tracking. J. exp. Psychol., 1957, 53,
285-292.

Duncan, D. B. Multiple range and multiple F tests.
Biometrics, 1955, 11, 1-42.

Ely, J. H., Bowen, H. M., & Orlansky, J. Man-machine
dynamics. USAF WADC tech. Rep., 1957,
No. TR 57-582.

Green, R. F., Norris, E. B., & Spragg, Shirley D.
Compensatory tracking performance (modified
SAM two-hand pursuit test) as a function of the
directions and planes of movement of the control
cranks relative to the movement of the target. J.
Psychol., 1955, 40, 411-420.

Lindquist, E. F. Design and analysis of experiments
in psychology and education. Boston: Houghton
Mifflin, 1953.

Muckler, F. A. Man-machine tracking performance
with short period oscillatory control system transients.
USAF WADD tech. Rep., 1960, No. TR
60-3.

Odeh, R. E., & Olds, E. G. Notes on the analysis of
variance of logarithms of variances. USAF WADC
tech. Note, 1959, No. TN 59-82.

Poulton, E. C. Two pointer and one pointer displays
in tracking. Med. Res. Council Rep., 1950,
No. APRU-150/50.

Senders, J. W., & Cruzen, Marianne. Tracking performance
in compensatory and pursuit tasks.
USAF WADC tech. Rep., 1952, No. TR 52-39.

Walston, C. E., & Warren, E. C. A mathematical
analysis of the human operator in a closed-loop
control system. USAF Personnel Train. Res. Cent.
tech. Rep., 1954, No. TR 54-96.


(Early publication received July 20, 1961) 


Journal of Applied Psychology 
1961, Vol. 45, No. 6, 376-380 


CROSS-VALIDITY OF PROCEDURES FOR SELECTING
LIFE INSURANCE SALESMEN 1


PETER F. MERENDA,2 WALTER V. CLARKE, AND CHARLES E. HALL 3


Walter V. Clarke Associates 


An extensive study of personality and bio- 
graphical factors as predictors of success-fail- 
ure of life insurance agents has shown that 
both personality characteristics (Merenda & 
Clarke, 1959a) and personal history variables 
(Merenda & Clarke, 1959b) are related to 
success in selling ordinary life insurance. The 
previous findings have further demonstrated 
that the predictive efficiency of selecting life 
insurance salesmen can be enhanced by the 
combined use of both sets of predictor vari- 
ables. 

An earlier study of Clarke (1956b) using 
the same personality assessment instrument 
employed in this study revealed evidence of 
discriminatory power of the personality vari- 
ables in distinguishing between high and low 
producers among life insurance agents beyond
2 years after hire. Another parallel study
by Wallace, Clarke, and Dry (1956) failed 
to show the same significant discriminations 
between first and second year successful and 
unsuccessful agents. 

The study presented in this paper reports 
on the cross-validity of the prediction system 
used in the earlier validity study (Merenda 
& Clarke, 1959b) administered to an inde- 
pendent sample of life insurance agents. The 
main intent of the study was to determine 
whether the power of prediction found in the 
validity study (Merenda & Clarke, 1959b) 
would hold up on cross-validation. A further 
purpose was to obtain data which would help 
clarify the apparent discrepancies found in 
the earlier studies of Clarke (1956b) and 
Wallace, Clarke, and Dry (1956). 


1 The major portions of this paper were presented 
at the Annual Meetings of the Midwestern Psycho- 
logical Association, St. Louis, Missouri, April 29, 
1960. 

2 Now at the University of Rhode Island.

3 Now at the Rhode Island State Department of
Education.


METHOD 


Subjects. Subjects in this cross-validation study
were 535 financed male life insurance agents hired
by an ordinary life insurance company between
January 1, 1955 and July 31, 1956. The earlier validation
sample was comprised of 522 agents hired by
this same company in the period between September
1, 1950 and December 31, 1954.

Predictors. Three sets of predictors of success as
life insurance salesmen were employed with the validation
sample. The first was a self-concept personality
assessment inventory. The particular instrument
was the Activity Vector Analysis (Clarke,
1956a) or AVA, as it is most commonly called. The
AVA was administered at the time of application for
the position of salesman. The second set was composed
of five personal-social history items contained
in the application blank completed by the salesman
prior to hire. A third predictor of age of the applicant
at the time of application for the position of life
insurance salesman was used in the earlier validation
study.

Criteria. Two sets of criterion standards were used.

The first criterion was a Success-Failure dichotomy
determined by combining dollar production with job
status within the company at the end of a 3-year
period after hire.

A successful agent (N = 97) is one who:

1. Meets his Training Allowance Program quota
or achieves $200,000 production in his first year and
at least $300,000 in either his second or third year; or

2. Is advanced to a supervisory or managerial
position within the company; or

3. Leaves the company to become an agent, supervisor,
or general agent of another company before
the end of the third year if he achieves the production
goals outlined above.

An unsuccessful agent (N = 438) is one who:

1. Fails to reach the production goals outlined,
whether or not he remains as an agent with the company; or

2. Has had his contract terminated by the company; or

3. Leaves the insurance industry within the 3-year
period.

The second criterion was the new business volume
(face value of insurance policies) at the end of the
first, second, and third years.
Procedure. In the validity study, discriminant
analysis was applied to the problem of providing
maximum separation between the 522 agents
dichotomized on the first criterion into 108 successful
and 414 unsuccessful salesmen. Separate analyses
were made for the personality, personal history, and
age predictors. The discriminant functions for each
predictor set proved to be statistically significant.
With respect to the age predictor, it was found in
the validity study that applicants who were younger
than 25 or more than 45 years old proved to be
substantially less successful than those who were
within this age group.

TABLE 1
RELATIVE PREDICTIVE EFFICIENCIES OF AVA, PERSONAL HISTORY, AND AVA PLUS PH
FOR VALIDITY AND CROSS-VALIDITY SAMPLES

                        Number of agents failing to meet standards    Percentage rejected
                                                                      by these standards
                        AVA alone     PH alone      AVA + PH          (3-year)
Sample            N     S     U       S     U       S     U           S     U
Validity          522   2     39      17    134     19    156         17    38
Cross-validity    535   6     60      4     73      10    121         10    28

Note. S = successful agents; U = unsuccessful agents.

The discriminant functions derived on the validity
sample were applied to the 535 subjects in the
cross-validity sample. Applying the first criterion standard
to the sample of 535, the group was dichotomized
into 97 successful and 438 unsuccessful agents.
Employing the same cut-offs indicated in the validation
study as being the optimal discriminant scores for
selecting the greatest proportion of potentially
unsuccessful agents at the sacrifice of a minimum of
potentially successful salesmen, predictions were made
of the success-failure of subjects in the cross-validity
sample. Also analyses were made of the relationship
existing between the scores on the predictor sets and
agent production for both the personality and personal
history variates. These cross-validities were determined
against the single criterion of paid production,
independently of dichotomization on the success-failure
criterion. The age data, however, were not available for
the cross-validity sample. Hence age was dropped from
the predictor set in the cross-validation procedure
reported in this paper.


RESULTS AND DISCUSSION

The linear discriminant scores for the
AVA profiles yielded a mean of 12.59 and
SD of 8.43 for the successful group, and a
mean of 9.08 and SD of 8.55 for the unsuccessful
group. For the total cross-validity
sample the AVA discriminant scores ranged
from -10 to +33. The linear composite
distributions of the personal history variates
yielded a mean of 108.64 and SD of 10.56 for 
the successful group, and a mean of 100.03 
and SD of 9.80 for the unsuccessful group. 
For the total sample, the personal history dis- 
criminant scores ranged from 70 to 134. 

Table 1 presents the results obtained in the 
cross-validation of the predictive efficiency of 
the predictor sets, employing multiple cut- 
offs. Any subject was considered to be a po- 
tential failure who failed to achieve at least 
a discriminant score of zero on the AVA set 
and a score of 91 on the personal history set 
of predictors. 
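
The multiple cut-off rule just described can be written as a short predicate. This is a sketch of one reading of the rule, namely that an applicant is flagged when he falls below either standard; that reading is an assumption, chosen because the combined counts in Table 1 exceed those for either predictor set used alone. The constant and function names are hypothetical.

```python
AVA_CUTOFF = 0.0    # minimum AVA discriminant score
PH_CUTOFF = 91.0    # minimum personal history discriminant score


def is_potential_failure(ava_score, ph_score):
    """Flag an applicant who misses the cut-off on either predictor set."""
    return ava_score < AVA_CUTOFF or ph_score < PH_CUTOFF
```

An applicant scoring exactly at both cut-offs passes the screen under this reading, since the text requires "at least" those scores.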

It is noted in Table 1 that the validity of 
prediction of failure of a significantly great 
proportion of life insurance salesmen who 
were actually unsuccessful at the end of 3 
years in the validity sample is upheld by the
data of the cross-validity sample. While some 
shrinkage is noted in the percentage of un- 
successful agents who failed to meet the stand- 
ards, a gain arose in the power of prediction 
of successful agents. A drop of 7 percentage 
points (from 17% to 10%) was noted in the 
proportion of successful agents who failed to 
meet the standards. 
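
The percentages discussed here can be recomputed from the counts of agents failing the combined AVA + PH standards and the sizes of the successful and unsuccessful groups given in the text. The sketch below is illustrative only; the dictionary layout and names are hypothetical.

```python
# (agents failing the combined standards, group size) taken from the text
TABLE_1 = {
    "validity": {"S": (19, 108), "U": (156, 414)},
    "cross-validity": {"S": (10, 97), "U": (121, 438)},
}


def rejection_rates(table=TABLE_1):
    """Return the percentage of each group rejected by the combined standards."""
    return {
        sample: {group: 100.0 * failed / size
                 for group, (failed, size) in groups.items()}
        for sample, groups in table.items()
    }
```

The computed rates for unsuccessful agents are close to 38% and 28%, and the rate for successful agents in the cross-validity sample is close to 10%. The validity-sample rate for successful agents computes to about 17.6%, which the text reports as 17%.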

It is further noted in Table 1 that the
ratio of unsuccessful salesmen to successful
agents was greater for the cross-validity sample
than in the validity sample. In the earlier
sample, 108 of the 522 agents were considered 
to be successful at the end of 3 years, whereas 
in the later sample only 97 of 535 were suc- 
cessful in terms of the criteria of the study. 
Two other significantly different features were
noted: the cross-validity sample contained
more agents than the former sample who
failed to pass the personality criterion, and
fewer who failed to pass the personal history
criterion.

TABLE 2
FIRST YEAR PRODUCTION YIELD OF PRODUCTIVE AGENTS
BY CATEGORIES OF THE AVA DISCRIMINANT SCORE

Score range (a)    N      Median (b)   Mean (b)    σ
 250 to 349        26     250.00       250.00      177.75
 150 to 249        102    156.25       201.95      173.75
  50 to 149        182    172.2        203.00      160.40
 -49 to 49         96     130.77       160.95      135.15
-149 to -50        19     116.67       127.65      105.70

Note. The data of this table are based on only those agents
who were still employed at the end of the first year.

a The original discriminant scores were multiplied by the
constant 10 in order to eliminate the use of decimals in the values
indicated in the score range.

b Paid production in thousands of dollars.

Although fewer of the agents failed to pass 
the selection screen than was the case with 
the validity sample, a total of 121 unsuccess- 
ful agents in the cross-validity sample would 
have been rejected at the sacrifice of only 10 
who proved to be 3-year successes. Over the 
3-year period these 121 unsuccessful agents 
sold a yearly average of $40,730 life insurance 
(face value). The 10 successful agents sold a 
yearly average of $340,000. The costs to the 
company in the initial recruitment, training, 
and financial support of these financed agents 
constituted a far greater investment than the 
3-year return from this group of 131 salesmen
who would not have been hired had the
selection system been in effect at the time of
hire.

TABLE 3
SECOND YEAR PRODUCTION YIELD OF PRODUCTIVE
AGENTS BY CATEGORIES OF THE AVA DISCRIMINANT
SCORE

Score range (a)    N      Median (b)   Mean (b)    σ
 250 to 349        16     300.00       384.40      326.05
 150 to 249        50     288.80       333.00      232.45
  50 to 149        90     288.89       287.20      212.35
 -49 to 49         40     200.00       263.75      217.80
-149 to -50        5      200.00       174.50      134.15

Note. The data of this table are based on only those agents
who were still employed at the end of the second year.

a The original discriminant scores were multiplied by the
constant 10 in order to eliminate the use of decimals in the values
indicated in the score range.

b Paid production in thousands of dollars.

A further criterion in the form of paid pro- 
duction (face value of life insurance sold)
was applied in the study. 


Personality Variables 


Tables 2-4 present the median and mean 
productions of agents who were actively en- 
gaged in selling insurance at the end of the 
first, second, and third years. In these tables; 
the sample subjects are classified according to 
AVA discriminant score categories regardless 
of classification on the success-failure criteria 
of the study. 

TABLE 4
THIRD YEAR PRODUCTION YIELD OF PRODUCTIVE
AGENTS BY CATEGORIES OF THE AVA DISCRIMINANT
SCORE

Score range (a)    N      Median (b)   Mean (b)    σ
 250 to 349        8      416.67       …           200.00
 150 to 249        35     292.86       350.70      217.25
  50 to 149        59     291.67       336.00      337.00
 -49 to 49         27     225.00       243.50      …
-149 to -50        3      225.00       208.35      …

Note. The data of this table are based on only those agents
who were still employed at the end of the third year.

a The original discriminant scores were multiplied by the
constant 10 in order to eliminate the use of decimals in the values
indicated in the score range.

b Paid production in thousands of dollars.

The data of Tables 2, 3, and 4 reveal that
there is a high relationship between scores obtained
on the personality profiles and agent
production at the end of each of 3 years. Both
the median and mean production yields of
agents who were actively engaged in selling
life insurance at the end of 1, 2, and 3 years,
respectively, appear to vary directly with the
category of discriminant score ranges on the
AVA. It is further noted in these tables that
those agents with discriminant scores of 50 or
above on the AVA sold, on the average, an
amount of insurance which either closely approached
or exceeded the minimum acceptable
standards of production set by the company's
agency department. These standards were
$200,000 at the end of the first year and
$300,000 at the end of either the second or
third year. The data further show that those
agents with AVA profiles which were closest
to those possessed by the successful category
of agents in the validity study sold, on the
average, substantially more life insurance
than those whose personality profiles (as revealed
by relatively low or negative discriminant
scores) were incompatible with the successful
agents in the validity sample.


Personal History Variables 


The cross-validity of the personal history predictor set is
demonstrated by the data of Tables 5, 6, and 7. Again, as with the
cross-validity findings on the personality factors, a near perfect
relationship was found to exist between the median/mean productions
and category classification with respect to discriminant score range
on the personal history set of variates. Those with the lowest
discriminant scores tended to sell, on the average, considerably less
life insurance than those with high scores. It is further noted from
the data of these tables that the cut-off score of 90, which was
indicated by the results of the earlier study to be an optimal
criterion, appears to be confirmed by the cross-validity findings.
Those agents with scores below 90 sold, on the average, less than the
minimum amount of life insurance in each of the 3 years. Also, by the
end of the third year, only 7 of the original 52 agents in this
category were still employed by the company.


TABLE 5
FIRST YEAR PRODUCTION YIELD OF PRODUCTIVE
AGENTS BY CATEGORIES OF THE PERSONAL HISTORY
DISCRIMINANT SCORE

Score range        N    Median^a    Mean^a      σ
120 and above     21     208.34     232.15    217.28
110-119           58     228.57     258.62    184.86
100-109          153     102.50     193.30
 90- 99          141     143.33     178.90
 89 and below     52     111.11     138.46

^a Paid production in thousands of dollars.


SUMMARY AND CONCLUSIONS

An independent sample of 535 ordinary life insurance agents was used
to cross-validate a selection system originally validated on an
earlier sample of 522 agents from the same company. The predictor sets
were: (a) a personality profile measured by the Activity Vector
Analysis and (b) a set of five personal history variables obtained in
a weighted application blank. The criteria were: (a) 3-year agent
success-failure determined on a multiple criterion basis incorporating
agent production and company status and (b) dollar amount of face
value of life insurance sold. The predictions were made on the basis
of multiple cut-offs, i.e., any agent failing to pass either of the
critical scores on the two predictor sets was identified as a
potential failure. These cut-off scores were determined from the
discriminant analysis results from the validity sample of 522 agents.


TABLE 6
SECOND YEAR PRODUCTION YIELD OF PRODUCTIVE
AGENTS BY CATEGORIES OF THE PERSONAL HISTORY
DISCRIMINANT SCORE

Score range  N  Median^a  Mean^a  σ
120 and above 12 533.34 360.46
110-119 39 323.72 222.89
100-109 70 301.43 224.52
90-99 61 272.54 202.33
89 and below 19 180.27 173.88

^a Paid production in thousands of dollars.

The data of this cross-validity study con- 
firm the findings obtained in the original vali- 
dation study. These are: 

1. The AVA is a valid predictor of success- 
failure among life insurance agents. 

2. Certain personal history measures are valid predictors among life
insurance agents. The predictive validity of one of the measures (age)
was not, however, upheld.

3. Combining AVA and personal history data enhances the predictive
efficiency of these measures in determining success or failure of life
insurance agents over a sustained period of time.


TABLE 7
THIRD YEAR PRODUCTION YIELD OF PRODUCTIVE
AGENTS BY CATEGORIES OF THE PERSONAL HISTORY
DISCRIMINANT SCORE

Score range
120 and above 12 383.33 604.17
110-119 28 275.00 301.79
100-109 48 275.00 290.63
90- 99 37 285. 307.43
89 and below 7 175.00 239.9

^a Paid production in thousands of dollars.


REFERENCES 


CLARKE, W. V. The construction of an industrial selection
personality test. J. Psychol., 1956, 41, 379-394. (a)

CLARKE, W. V. The personality profiles of life insurance
agents. J. Psychol., 1956, 42, 295-302. (b)

MERENDA, P. F., & CLARKE, W. V. Activity Vector Analysis
validity for life insurance salesmen. Engng. industr.
Psychol., 1959, 1, 1-11. (a)

MERENDA, P. F., & CLARKE, W. V. The predictive efficiency
of temperament characteristics and personal history
variables in determining success of life insurance agents.
J. appl. Psychol., 1959, 43, 360-365. (b)

WALLACE, S. R., CLARKE, W. V., & DRY, R. J. The Activity
Vector Analysis as a selector of life insurance salesmen.
Personnel Psychol., 1956, 9, 337-345.

(Received January 1, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 6, 381-387


EXPERIMENTAL EVALUATION OF BINARY CODES 
FOR CONSOLE DISPLAY 


FRANK J. MINOR AND STANLEY L. REVESMAN


International Business Machines Corporation, Endicott, New York


This study experimentally evaluated three binary code schemas in terms
of operator coding performance. The purpose of the evaluation was to
aid in the selection of one of three alternative code schemas to be
used on a data processing system console display. Information to be
displayed in coded form would be a computer memory address displayed
on an "address register display," and the contents of that address
displayed on a "character register display." This information is
required by the system programer to check the accuracy of a series of
data processing instructions which he has assembled to serve as a data
processing program. In installations of a given type such checking, or
debugging, is accomplished by executing via the system console one
program instruction at a time in its proper sequence. During this
cycling process the programer makes careful note of the program
instruction, the raw data, the data results and their corresponding
memory addresses as displayed in coded form on the appropriate console
character and address registers. This information is checked by the
programer to locate errors of programing which if left uncorrected
would result in invalid data.

The logic of the system was amenable only to the three codes evaluated
in this study, and would not permit modifications of the codes. The
code schemas were:

1. Biquinary Code
2. Binary Coded Decimal
3. Combined Code

Codes 1 and 2 are currently in use for display purposes on data
processing systems. Code 3 is a workable combination of the first two
codes, but is not currently in use for display purposes. For the use
for which this system was intended, transfer of training was not an
issue in this study. It is not expected that programers trained for
existing stored program data processing systems would also perform
programing for this new system.

Of the three code schemas, the Binary Coded Decimal was the most
efficient in terms of cost and electronic simplicity for this specific
system. The difference in efficiency, however, was not sufficient to
justify eliminating the other two schemas from consideration. This
study was conducted to determine what performance differences could be
expected in terms of programer's performance at the console as a
function of these code variations.


METHOD 
Dependent Variables 


The criterion for the codes evaluation was the pro- 
gramer’s efficiency measured in terms of speed and 
accuracy of coding console address and character 
registers. The speed with which a programer can per- 
form the coding task is critical since the data proc- 
essing system is nonproductive during the debugging 
procedures at the console. When the hourly rental 
cost of such a high speed data processing system is 
considered, any reduction in system downtime makes 
for more efficient utilization of the system. Fre- 
quently there is more than one programer assigned 
to a system. Each programer writes and debugs his 
own programs. 

The program debugging task at the console is not 
a primary task of any one programer, and is of an 
intermittent nature. However, total system down- 
time due to debugging by all programers assigned to 
the system could significantly reduce potential system
computing time. The frequency and time spent
by any one programer interrogating a system for 
program debugging purposes will vary as a function 
of the number and complexity of new programs he 
is required to write, his ability, and his preferred 
work methods. Barring very short and simple pro- 
grams, newly written programs contain errors which 
require detection and correction. 


Experimental Conditions 


TABLE 1
BIQUINARY CODE
[Bit codes for each character]


TABLE 2
BINARY CODED DECIMAL CODE
[Bit codes for each character]


Each of the three possible code schemas facilitated the coding of 48
characters. A character could be displayed on the computer console by
the illumination of one or more of a configuration of neon bulbs in
each register. For the experimental conditions the possible
configurations of illuminated neon bulbs were simulated with paper and
pencil coding tasks. The configuration of lights for each code
structure was simulated by printed circles on the test sheets. A
description of each code condition is as follows:

Code Condition 1 is an 11-bit (11 lights) Biquinary
Code system of which 3 of the 11 lights are impulsed
to denote a character. The code system as
shown in Table 1 would be the basis for the character
register display; a modification of this code system
would be the basis for the address register
display.

Code Condition 2 is a 7-bit Binary Coded Decimal 
system of which a variable number of neon bulbs 
averaging 3.33 per character are impulsed to denote 
a character. The code system as shown in Table 2 
would be the basis for the character register dis- 
play; a modification of this code system would be 
the basis for the address register display. 

Code Condition 3 is a combination of Codes 1
and 2 such that the character coding schema for
Code 1 would be the basis for the character register
display; the address register coding schema for Code
2 would be the basis for the address register display
(see Tables 1 and 2).
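
The capacity of the two basic codes can be checked by simple counting. The sketch below is illustrative only and is not part of the original study. It assumes that any 3 of the 11 Biquinary lights, and any nonblank pattern of the 7 Binary Coded Decimal lights, could in principle denote a character. These are upper bounds: the actual bit assignments of Tables 1 and 2 are more restrictive. The function names are hypothetical.

```python
from math import comb

CHARACTERS_REQUIRED = 48  # characters each schema had to code (from the text)


def three_of_eleven_patterns() -> int:
    """Upper bound for Code 1: ways to light exactly 3 of 11 bulbs."""
    return comb(11, 3)


def seven_bit_patterns() -> int:
    """Upper bound for Code 2: on/off patterns of 7 bulbs, excluding all-dark."""
    return 2 ** 7 - 1


if __name__ == "__main__":
    for name, count in (
        ("3-of-11 (assumed unrestricted)", three_of_eleven_patterns()),
        ("7-bit (assumed unrestricted)", seven_bit_patterns()),
    ):
        # Both bounds comfortably exceed the 48 characters required.
        print(name, count, count >= CHARACTERS_REQUIRED)
```

Under these assumptions both codes have ample room for 48 characters, so the choice among them turns on operator performance and cost rather than on capacity.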


Design 

A 3 × 3 experimental design with repeated measurements of independent
groups was utilized. The three code schemas were analyzed under a
series of three separate trials.


Subjects 


Fifty-five naive subjects (33 males, 22 females) were employed in the
study. None of the subjects had had previous experience with
nondecimal number systems or codes such as those of the experiment.
Subjects were paid volunteers from a junior technical college (Broome
Technical Community College). All subjects were drawn from a
population of students majoring in Business Technology. This
particular population was selected to increase the homogeneity of
subjects on task relevant variables (primarily clerical skills) and
also to have subjects who were representative of the expected data
processing system programers.

Two of the independent groups consisted of 18 subjects each; the other
group, 19 subjects. Scheduling limitations precluded a completely
random assignment of males and females across groups. Consequently,
there was a disproportionate assignment of males and females to each
group. A post facto analysis, however, supported the assumption of no
sex differences for the coding task. To test for equality in general
ability level between the groups, t tests were conducted between
groups based on the cumulative scholastic average of each subject. The
results of the t tests indicated no differences between groups.


Apparatus


Printed master code sheets which presented all letters of the
alphabet, the numbers 0-9, and 12 special characters were made up for
each code condition. The code sheets were used during training and
testing of the subjects.

Paper and pencil tests were utilized in the experiment. These were
designed to simulate coding problems which would occur on the data
processing system console registers (see Figures 1 and 2 for
examples). Three test booklets were used under each experimental code
condition. Each test booklet consisted of 36 problems. Each problem is
defined as consisting of a character register and an address register.

In one-half of the problems the subjects were required to encode by
placing an X in the appropriate circles. In the remaining one-half of
the problems the test items were presented with Xs already in the
printed circles and the subjects were required to decode by placing
the appropriate character on the answer line below the register. To
control the level of difficulty of the test problems, all groups were
given identical test problems which were structured in accordance with
their assigned experimental code. Selection of the test problems to be
used, and the determination of whether they were to be encoded or
decoded, was arrived at by random assignment.

Fig. 1. Examples of simulated display registers for Biquinary Code.

Fig. 2. Examples of simulated display registers for Binary Coded Decimal Code.


Procedure 


Each group was required to participate in two 
phases of the experiment. The two phases were ad- 
ministered on two consecutive days. Phase 1 was a 
one-hour group training session. Phase 2, adminis- 
tered on the following day, consisted of a group 
testing session, also approximately one hour long. 

Task. Subjects were required to code character 
registers and address registers in the manner in which 
such coding occurs in actual field use. Programers in 
the field perform address register coding without the 
aid of a master code sheet since the task is com- 
patible with established habits of handling decimal 
system numeric information. Character register cod- 
ing however is frequently conducted by reference to 
master code sheets such as used in this study. The 
code sheet is necessary because many programers do 
not perform coding tasks with sufficient frequency 
to be able to recall character code configuration. 

The format for the master code sheet for each code condition was
identical. Letters of the alphabet were presented in alphabetic order
in the first three rows, and the digits 0 to 9 were presented in
numeric order in the fourth row. This format facilitated a systematic
organization of the code sheet for both encoding and decoding tasks,
regardless of the code schema. There was no apparent variation of this
format which would improve the compatibility between the experimental
task and any given master code sheet.

To perform encoding under any of the three conditions, the subject
would locate the desired letter, digit, or symbol on the code sheet
and make note of the circles containing an X. To decode under any of
the three conditions, the subject systematically located the character
by first determining which of the four rows to search in and then
locating the character.

Training. Master code sheets were distributed to the members of each
group during the training session. The code sheet corresponded to the
unique code condition assigned to the group. The organization of the
code sheet layout was explained to establish search behavior habits.
Test sheets identical to those to be used in the testing phase of the
study were distributed to the subjects. An experimenter demonstrated
to the subjects 10 representative decoding and 10 representative
encoding problems. The demonstration was followed by practice trials
by the subjects. For each experimental condition, the same encoding
and decoding problems were drawn from each of the four rows of the
master code sheet and presented to the subjects as examples. One
problem at a time was presented to each group and then checked for
accuracy by both the subjects and two monitoring experimenters. Before
continuing to the next problem, each subject was required to correct
his errors. If the subject could not demonstrate to the experimenter
that he could perform the correction, the experimenter would again
demonstrate the general coding procedure to him.

The criterion of success to which each subject was
trained was that of accurate unaided encoding and
decoding of 24 consecutive coding problems. These 
problems were representative of all the types of prob- 
lems to be presented in the test session. Because of 
the simplicity and highly redundant nature of the 
task, no subject under any condition required aid 
following the first five practice problems for any of 
the problem types (i.e. decoding a character register 
or address register, encoding a character register or 
address register). The need for aid was no more fre- 
quent under one condition than any other condition. 

Upon completing the group training session the
subjects were instructed that immediately prior to
the testing session, a demonstration of coding would
be repeated.

It was assumed that subjects would not rehearse 
the task in different amounts between the time of 
training and the time of testing for the following 
reasons: the task did not require memorization of 
stimulus code configuration, and therefore there was 
no need for rehearsal; because of the simplicity of 
the tasks the subjects reached complete mastery of 
the task early during the training session. 

Testing. As indicated above a brief review of coding was presented to
the subjects just prior to testing. Testing consisted of three trials
with one test booklet per trial. Three sets of test booklets (1, 2,
and 3) were used, each set consisting of different problems. To make a
valid statistical test of the learning effect across trials, each
trial required a random assignment of the test booklets to the
subjects within a given code condition, with the restriction that each
subject perform on a different booklet on each trial. Subjects were
instructed to complete all problems in a test booklet as accurately
and rapidly as possible, and to signal their completion by a raised
hand. The time required by each subject was recorded by the
experimenters.
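
The booklet assignment described under Testing can be sketched as follows. This is a minimal illustration, not the experimenters' actual procedure: it assumes that each subject simply receives an independent random ordering of the three booklets, which satisfies the stated restriction that no subject meets the same booklet twice. The function name and the example group size of 18 are chosen for illustration.

```python
import random

BOOKLETS = (1, 2, 3)  # the three test booklets named in the text


def assign_booklets(subject_ids, rng=None):
    """Return {subject: (booklet for Trial 1, Trial 2, Trial 3)}.

    Each subject gets a random permutation of the booklets, so every
    subject works on a different booklet on each trial.
    """
    rng = rng or random.Random()
    return {
        sid: tuple(rng.sample(BOOKLETS, k=len(BOOKLETS)))
        for sid in subject_ids
    }


if __name__ == "__main__":
    # Hypothetical 18-subject group; the seed only makes the run repeatable.
    schedule = assign_booklets(range(1, 19), random.Random(0))
    for subject, order in schedule.items():
        print(subject, order)
```

Independent permutations do not guarantee that each booklet appears equally often on each trial within a group. The text states only the restriction shown here, so no further balancing is assumed.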


Fig. 3. Average time required to complete a test. [Time (minutes) plotted against Trials 1, 2, and 3 for each code condition.]


RESULTS 


A 3 X 3 analysis of variance for repeated 
measurements on independent groups was per- 
formed for both the time and error criteria 
(Edwards, 1957). 


Time Criterion Analysis 


The 3 X 3 analysis of variance showed over- 
all significant differences among code condi- 
tions and among trials (p < .001). A test of 
interaction between trials and code conditions 
was insignificant. 

The results of the time criterion are pre- 
sented graphically (see Figure 3). The results 
of the analysis of variance are summarized in 
Table 3. 

TABLE 3

SUMMARY OF ANALYSIS OF VARIANCE FOR TIME REQUIRED FOR TEST COMPLETION
WITH THE 3 CODE CONDITIONS WITH 3 TRIALS FOR EACH GROUP

Source of variation                   SS      df      MS        F
Between Codes I, II, III            214.5      2     107.3     19.5*
Between Subjects in Same Group
Total between Subjects              498.7
Between Trials A, B, C                         2     197.6    286.3*
Codes × Trials
Pooled Subjects × Trials
Total within Subjects
Total                               970.1


To detect where the differences in mean time occurred between paired
conditions, multiple t tests were computed by the Scheffé (1953)
method. It was found that the time required to perform the tasks using
Code 2 was significantly less than the time required using either Code
1 or Code 3 (p < .001 and p < .01, respectively). The time required
for Code 3 was less than that of Code 1, but the difference was not
statistically significant. Similar Scheffé tests were computed for the
mean time required under each trial where the data were pooled for the
three code conditions. The results showed that there was a significant
reduction of the average time required with every successive trial
(p < .01).
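
The legible entries of Table 3 can be checked for internal consistency. The sketch below is an illustrative reconstruction and not the authors' computation. It assumes the usual partition for repeated measurements on independent groups, with 55 subjects, 3 groups, and 3 trials, so that subjects within groups carry 52 degrees of freedom and the pooled subjects × trials error carries 104. The error mean square for trials is inferred from the reported F, since that entry is not legible.

```python
N_SUBJECTS, N_GROUPS, N_TRIALS = 55, 3, 3

# Entries legible in Table 3
ss_codes, df_codes, f_codes_reported = 214.5, 2, 19.5
ss_between_subjects_total = 498.7
ms_trials, df_trials, f_trials_reported = 197.6, 2, 286.3
ss_total = 970.1

# Between-subjects partition
ms_codes = ss_codes / df_codes
ss_subjects_in_groups = ss_between_subjects_total - ss_codes
df_subjects_in_groups = N_SUBJECTS - N_GROUPS
ms_subjects_in_groups = ss_subjects_in_groups / df_subjects_in_groups
f_codes = ms_codes / ms_subjects_in_groups

# Within-subjects partition
ss_trials = ms_trials * df_trials
ss_within_total = ss_total - ss_between_subjects_total
ss_remainder = ss_within_total - ss_trials  # interaction plus pooled error

# Assumption: error mean square inferred from the reported F for trials
ms_error_implied = ms_trials / f_trials_reported
df_error = df_subjects_in_groups * (N_TRIALS - 1)
ss_interaction = ss_remainder - ms_error_implied * df_error
df_interaction = (N_GROUPS - 1) * (N_TRIALS - 1)
f_interaction = (ss_interaction / df_interaction) / ms_error_implied

if __name__ == "__main__":
    print("MS codes:", round(ms_codes, 2))
    print("F codes (recomputed):", round(f_codes, 2))
    print("SS trials:", round(ss_trials, 1))
    print("F interaction (implied):", round(f_interaction, 2))
```

The recomputed F for codes falls close to the reported 19.5, and the implied interaction F is small, which agrees with the statement that the codes × trials interaction was not significant.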


Analysis of Error Criterion 


The 3 x 3 analysis of variance showed no 
over-all significant difference in the number 
of errors among the code conditions. How- 
ever, there was a significant over-all differ- 
ence in mean number of errors among trials 
(p < .01). The test of interaction between 
code conditions and trials was not statistically 
significant. 

To detect where the differences in the mean 
number of errors occurred between paired 
trials, Scheffé sequential range tests were em- 
ployed. It was found that Trials 2 and 3 had 
significantly fewer errors than the first trial 
(p < .01 and p < .05, respectively). The dif- 
ference between Trials 2 and 3 was not sta- 
tistically significant. 

Additional analyses of the error criterion 
were performed. With the data of the three
code conditions pooled, a t test for matched
groups showed that there were significantly
fewer errors made in the address register than 
in the character register (p < .01). It was 
also found that there were significantly fewer 
errors made in the encoding process than in 
the decoding process (p < .01). 

No one condition had significantly more 
errors than another condition in either the 
character or address registers. In addition, no 
one condition had significantly more encoding 
or decoding errors than another condition. 

The level of difficulty of the three sets of
tests employed in the study was statistically
checked. Neither the time nor error criterion 
revealed significant differences between the 
three tests. 




CONCLUSIONS AND DISCUSSION 


There were two major issues with which 
this study was concerned: (a) Which code 
system will permit the operator to utilize his 
code display registers with the most rapid 
rate of speed and result in the least number 
of errors? (b) With which code system can 
the operator improve his performance most 
rapidly? 

With reference to a, the results indicate that 
the Binary Coded Decimal system (Code 2) 
permits coding performance to be carried out 
at a rate of speed which is more rapid than 
either of the other two codes. It requires 25%
less time when compared to the Biquinary
Code and 20% less time when compared to
the Combined Code.

An inspection of time criterion data suggests that the greater
proportion of differences in average time required between conditions
was a function of character register coding. The rationale of this
conclusion is that there was a significant difference between Code
Condition 2 and Code Condition 3 although both code systems had the
Binary Coded Decimal system used in the address register. It is
hypothesized that the Biquinary Code system requires greater coding
time as a function of the additional operator decision time in
determining whether a bit is to be read on the left or right side of
that bit.

It is believed that the time advantage of the Binary Coded Decimal
would be a permanent one. The rationale for this assumption is as
follows: The coding task as performed at the data processing console
by any one programer assigned to the system is not a high frequency
task as compared to the major task of developing and writing new
programs. Consequently, a great many programers do not practice
sufficiently with the codes to memorize individual character code
configurations. Coding performance in the field therefore is usually
conducted with the aid of a master code sheet, as simulated in this
experiment. Because of forgetting, it is hypothesized that, in the
field, the level of performance and the performance change from one
console coding session to the next would be very similar to the
performance profiles of this experiment.

The analysis of the error criterion showed 
no significant statistical differences between 
the code conditions. This finding suggests that 
the rate of error would not be expected to 
differ for the three code conditions in field 
operations. 

With reference to b, concerning the rate of 
performance improvement, the significant re- 
duction in time required, and number of errors 
in successive trials, indicates that improved 
performance did take place. Under no code 
condition was learning rate more rapid than 
another condition. This conclusion is supported
by the statistical finding that interaction between
trials and code systems was insignificant.
Therefore, the slopes of the plotted
learning curves under each experimental code 
were near parallel for all practical purposes. 

The results of this experiment have practical implications for
selecting a code for a data processing system console. As stated
earlier, the Binary Coded Decimal schema was the most efficient in
terms of cost and electronic simplicity. The savings in operator
coding time at the console, which would ensue from the use of this
code, further justify its use.

During program checking and debugging procedures at the console, the
data processing system is nonproductive. Any reduction in system
downtime constitutes an increase in system efficiency. There are many
installations where new and complex programs must frequently be
developed and checked for accuracy. It is in such installations that a
20-25% reduction in debugging time at the console can result in a
practical increase in system productivity.


SUMMARY 


Three binary code schemas were evaluated 
experimentally in terms of operator coding 
performance. One of the three code schemas 
was to be selected for use on a data processing 
system console display. The criterion for the 
code evaluation was the operator efficiency 
measured in terms of speed and accuracy 
of coding console problems. An independent 
group of subjects was assigned to each of the 
three code conditions. One of the three code 
schemas facilitated a significant time saving 
of 20-25% as compared to the remaining two 
code schemas. There were no differences in 
rate of error between the three code condi- 
tions. 

REFERENCES 


EDWARDS, A. L. Experimental design in psychological
research. New York: Rinehart, 1957.

SCHEFFÉ, H. A method for judging all contrasts in
the analysis of variance. Biometrika, 1953, 40, 87-104.

(Received November 16, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 6, 388-392


SUPERVISORY PROCEDURES AND WORK-TEAM PRODUCTIVITY


J. S. KIDD AND R. T. CHRISTY


Ohio State University 


In spite of the intensive concern of social psychologists with the
general phenomenon of leadership, only fragmentary data are available
concerning the interrelations between the role specifications of the
leader, the individual characteristics of the leader, the operational
or task setting, and team productivity. Ever since the Hawthorne
studies (Roethlisberger & Dickson, 1939) it has been recognized that
the supervisor of a task oriented group or team can influence
systematically its productive output. Lewin and his colleagues were
able to demonstrate that the way in which leadership responsibilities
were carried out was an important factor in determining many aspects
of group behavior (Lewin, Lippitt, & White, 1939). Again, in the
context of corporate organizations, Shartle (1956) has pointed out
that the personal characteristics of the executive are likely to
predominate over the formal role requirements in determining the
manner in which the leadership functions are fulfilled.

The present experiment was in no way intended to bring closure to the
complex questions surrounding leadership role and team productivity.
Its limited purposes were: to provide some insight into the relative
influence of individuals vs. roles, to test the hypothesis that
differential supervisory role specifications can have an effect upon
team performance, to compare the effect of the different supervisory
roles on more than one class of team performance criteria, and to
develop some guidelines to facilitate the management of multiple team
activities, especially those in which the team is to function in a
man-machine system.

1 This research was carried out in the Laboratory of Aviation
Psychology and was supported in part by the United States Air Force
under Contract No. AF 33(616)-6166, monitored by the Aerospace Medical
Division. Permission is granted for reproduction, translation,
publication, use, and disposal in whole and in part by or for the
United States Government.


METHOD 


Apparatus and Task Setting 


The general task environment was provided by simulation within the
laboratory of a radar air traffic control center. The focus of
observation was a three-man team consisting of two operators and a
supervisor. This team was made responsible for the guidance of
simulated aircraft through the preliminary phases of a landing
approach. The task involved pickup and acknowledgment of aircraft
entering a specified zone of responsibility, guidance of the flight
course over a 50-mile approach route, altitude and airspeed adjustment
prerequisite to actual landing, and positioning the aircraft for
acceptance by a subsequent control agency for the final phase of the
landing process. All task inputs were simulated.

The simulation was implemented by the specially developed OSU
Electronic Air Traffic Control Simulator. This device, which is built
around an analog computer, is capable of generating up to 30 aircraft
targets and presenting them to the radar controller via a cathode ray
tube display. Direct manipulation of the "aircraft" is accomplished by
college students trained to faithfully play the part of pilots. In
addition to the visual display of aircraft position available to the
controller, he has direct auditory communication with the "pilots"
under his jurisdiction through simulated radio channels.


Experimental Variables and Performance Measures


Three types of supervisor roles were studied in a context which
included two levels of traffic input rate (load). The three
supervisory roles were defined as follows:

Laissez-faire. In this condition the supervisor was instructed to act
as a passive monitor of the ongoing operation. He was available for
consultation and was permitted to respond to direct questions by the
controllers. He was responsible for keeping informed of the situation,
but did not intrude unless so requested by one of the controllers.

Active monitor. In this role, the supervisor acted as a
"super-controller." He would observe carefully the progress of the
operation in order to anticipate difficulties and detect controller
errors. If such a condition arose, the supervisor was to give
instructions to the controller involved so as to enable the latter to
correct the situation.

Direct participant. This role differed from the preceding primarily in
that the supervisor was directed to take remedial actions by direct
contact with the pilots. Thus, instead of requesting the controller to
carry out some corrective action, he could take the corrective action
at his own discretion through a direct communication contact with the
appropriate pilots. Coordination was automatic since the "channel"
used by the supervisor was the same as that used by the controller,
and any communication between supervisor and the pilots under a given
controller's jurisdiction was overheard by that controller.

These three conditions can be regarded as defining a rough
continuum of supervisory intervention. In all cases,
however, the supervisor's actions are moderated by the
presence and actions of his subordinates. Thus, even in the
most extreme case, the supervisor does not have an
opportunity to impose absolutistic control.

Two conditions of traffic input rate were utilized as a
check on possible interaction effects. Rates of one
aircraft entry every 43 seconds and one entry every 90
seconds were used.

Four performance measures were recorded. Mean percent
delay was employed as the primary index of system
efficiency. It reflects the rapidity with which aircraft
are moved through the system and is defined as follows:

\[
\text{Mean percent delay} = \frac{100 \sum \dfrac{\text{Actual} - \text{Hypothetical minimum flight time}}{\text{Hypothetical minimum flight time}}}{\text{Number of aircraft}}
\]
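
The delay index can be illustrated with a short computation. The following is a minimal sketch, assuming that the sum runs over individual aircraft and that the ratio is scaled by 100 to give a percentage; the function name and the flight times are hypothetical and are not data from the study.

```python
def mean_percent_delay(actual_times, minimum_times):
    """Mean percent delay over a set of aircraft.

    Each aircraft contributes (actual - minimum) / minimum; the mean of
    these ratios is multiplied by 100 to express it as a percentage.
    """
    if len(actual_times) != len(minimum_times) or not actual_times:
        raise ValueError("need one minimum time per actual time")
    ratios = [(a - m) / m for a, m in zip(actual_times, minimum_times)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical flight times in minutes (illustrative only).
actual = [30.0, 24.0, 45.0]
minimum = [15.0, 12.0, 30.0]
delay = mean_percent_delay(actual, minimum)
```

On this reading, a value of 100 means that aircraft took, on the average, twice their hypothetical minimum flight time.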


A second measure was derived from the use, in the 
present study, of preprogramed pilot errors. Records 
were taken of the time lag between the occurrence 
of such a preprogramed error and its correction by 
the controller responsible. The programed errors were 
in the form of deviations from assigned heading. The 
third criterion consisted of a count of the number of 
controller errors made in positioning aircraft for the 
final portion of the landing process. The fourth meas- 
ure was a count of the number of controller errors 
involving aircraft separation. If one aircraft “flew” 
within 2-miles lateral or 1,000-feet vertical distance 
of another aircraft, the controller was scored with a 
separation error. 


TABLE 1
STATISTICAL DESIGN

Order of                                                      Subjects (teams)
sessions  Conditions        1                   2                   3                   4                   5                   6
1         Supervisor        A                   C                   B                   B                   A                   C
          Supervisory role  Laissez-faire       Laissez-faire       Direct participant  Direct participant  Active monitor      Active monitor
          Load              Low                 High                High                Low                 High                Low
2         Supervisor        A                   C                   B                   B                   A                   C
          Supervisory role  Laissez-faire       Laissez-faire       Direct participant  Direct participant  Active monitor      Active monitor
          Load              High                Low                 Low                 High                Low                 High
3         Supervisor        C                   B                   C                   C                   B                   A
          Supervisory role  Direct participant  Active monitor      Active monitor      Laissez-faire       Laissez-faire       Direct participant
          Load              High                Low                 High                High                Low                 Low
4         Supervisor        C                   B                   C                   C                   B                   A
          Supervisory role  Direct participant  Active monitor      Active monitor      Laissez-faire       Laissez-faire       Direct participant
          Load              Low                 High                Low                 Low                 High                High
5         Supervisor        B                   A                   A                   A                   C                   B
          Supervisory role  Active monitor      Direct participant  Laissez-faire       Active monitor      Direct participant  Laissez-faire
          Load              High                Low                 Low                 Low                 High                High
6         Supervisor        B                   A                   A                   A                   C                   B
          Supervisory role  Active monitor      Direct participant  Laissez-faire       Active monitor      Direct participant  Laissez-faire
          Load              Low                 High                High                High                Low                 Low


FIG. 1. Mean percent flight delay under various conditions
of supervisory role and input load. (Axes: mean percent
delay by supervisory role, laissez-faire, active monitor,
and direct participant; separate curves for low and high
input load.)


Subjects and Procedure 


Specially trained college students participated both as
controllers and as supervisors. Six two-man control teams
were formed by random pairing from a 12-man sample. The
three supervisors each had more than 2.5 years experience
in the laboratory as controllers and had filled various
leadership positions during this time. The test trials were
arranged so that each supervisor had one opportunity to
work with each of the teams. Thus, no control team
experienced a supervisor in more than one role, although,
in going from team to team, the supervisor changed roles. A
modified Greco-Latin square design was used so as to
balance load, role, and supervisor factors with practice
effects. The design is detailed in Table 1.


A given experimental session represented a particular
combination of one control team, a supervisor, and a role.
Each session was 3.5 hours in duration and incorporated
both input load conditions. Eighteen such sessions were
required to complete the design requirements.
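
The balance properties described above can be checked mechanically. The sketch below is illustrative only: the supervisor and role assignments are a transcription of Table 1 and should be treated as an assumption wherever the printed table is hard to read, and the names `SESSIONS` and `check_design` are hypothetical. Successive pairs of rows in Table 1 share a supervisor and role and differ only in load, so each pair is entered here as one session.

```python
from collections import Counter

# (supervisor, role) for teams 1-6 in each of the three sessions.
# LF = laissez-faire, AM = active monitor, DP = direct participant.
# Transcribed from Table 1; an assumption where the print is unclear.
SESSIONS = [
    [("A", "LF"), ("C", "LF"), ("B", "DP"), ("B", "DP"), ("A", "AM"), ("C", "AM")],
    [("C", "DP"), ("B", "AM"), ("C", "AM"), ("C", "LF"), ("B", "LF"), ("A", "DP")],
    [("B", "AM"), ("A", "DP"), ("A", "LF"), ("A", "AM"), ("C", "DP"), ("B", "LF")],
]


def check_design(sessions):
    """Return True if the design is balanced in the ways the text describes."""
    n_teams = len(sessions[0])
    for team in range(n_teams):
        supervisors = {session[team][0] for session in sessions}
        roles = {session[team][1] for session in sessions}
        # Each team meets every supervisor once, each time in a different role.
        if len(supervisors) != len(sessions) or len(roles) != len(sessions):
            return False
    # Each supervisor takes each role equally often across teams.
    counts = Counter(cell for session in sessions for cell in session)
    return len(counts) == 9 and len(set(counts.values())) == 1


balanced = check_design(SESSIONS)
total_sessions = sum(len(session) for session in SESSIONS)
```

Six teams each working once with three supervisors gives the 18 sessions mentioned in the text.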


RESULTS 


Three factors are of analytic concern: the 
influence of the supervisors as individuals on 
system performance, the influence of super- 
visory roles, and the influence of input load. 
All three factors and their interactions were 
compared in terms of the mean percent delay 
criterion. The remaining criteria were employed
only with respect to the comparison of the 
effects of supervisor roles. 

The first step in analysis was an over-all comparison by
analysis of variance of the three factors using the delay
criterion. This was done primarily to assess the
comparative variance contributed by supervisors as
individuals vs. the variance contributed by the role
assignments and as a check on interactions rather than as a
direct comparison of the conditions within a given factor.
The influence of supervisors as individuals yielded a mean
square variance 3.7 times greater than that derived from
the role factor. The interactions of role and supervisor
and of role and load were statistically negligible. The
influence of these factors may be seen directly by
reference to Figures 1 and 2. The relationship between role
and supervisor is of potential interest since the best of
the three supervisors is also the least variable insofar as
this can be assessed by the performance of his crews.

Since the effect of interaction between the experimental
variables was relatively minor, direct comparison of
conditions within factors was undertaken by nonparametric
methods. Table 2 presents the comparison of the three
supervisory roles in terms of all four criteria of system
performance. In all cases the analysis is carried out
between pairs of conditions, since they represent discrete
categories rather than a continuous dimension. It will be
noted that the laissez-faire role is superior according to
the delay criterion but inferior with respect to the
avoidance of separation errors.

The input load factor was evaluated and found to be highly
significant. Under low load conditions


FIG. 2. Mean percent flight delay with individual
supervisors taking various supervisory roles. (Axes: mean
percent delay by supervisory role, laissez-faire, active
monitor, and direct participant; separate curves for
Supervisors A, B, and C.)


TABLE 2
THE INFLUENCE OF SUPERVISORY ROLE ON VARIOUS MEASURES OF SYSTEM PERFORMANCE

                                   Role designation                          Paired tests a
                           Laissez-  Active   Direct        Laissez-faire   Laissez-faire   Active monitor
Criteria                   faire     monitor  participant   vs. active      vs. direct      vs. direct
                                                            monitor         participant     participant
Mean percent delay         99.2      116.5    107.4         d1 + d2 > 0     ns              ns
                                                            p = .062
Pilot error detection lag  92.7      82.2     97.8          ns              ns              d1 + d2 > 0
                                                                                            p = .062
GCA go-arounds per         4.7       2.6      2.9           d1 + d2 > 0     d1 > 0          ns
100 landings                                                p = .062        p = .031
Separation errors per      4.5       3.9      2.9           ns              d1 + d2 > 0     ns
100 landings                                                                p = .062


a The Walsh test was used throughout (Siegel, 1956).


the average percent delay was 97.8, while un- 
der high load conditions it was 117.6. By the 
Walsh test, such a difference has a probability 
of chance occurrence of .031. 
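
The .031 level quoted here is the value a sign-randomization argument gives when all of six paired differences share one sign. The following is a minimal sketch, assuming six paired differences (one per team), a null hypothesis under which every pattern of signs is equally likely, and test criteria of the form d1 > 0 and d1 + d2 > 0, where d1 is the smallest signed difference. The magnitudes are hypothetical and are not data from the study.

```python
from itertools import product


def two_tailed_probability(magnitudes, upper_event, lower_event):
    """Share of equally likely sign patterns for which either tail event holds."""
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(magnitudes)):
        # Signed differences in ascending order: d[0] is d1, d[-1] is d6.
        d = sorted(s * m for s, m in zip(signs, magnitudes))
        total += 1
        if upper_event(d) or lower_event(d):
            hits += 1
    return hits / total


# Hypothetical absolute differences for six teams (distinct values).
magnitudes = [1.0, 2.5, 4.0, 5.5, 7.0, 9.0]

# Criterion d1 > 0, with its mirror image d6 < 0 for the other tail.
p_d1 = two_tailed_probability(
    magnitudes, lambda d: d[0] > 0, lambda d: d[-1] < 0
)

# Criterion d1 + d2 > 0, with its mirror image d5 + d6 < 0.
p_d1_d2 = two_tailed_probability(
    magnitudes, lambda d: d[0] + d[1] > 0, lambda d: d[-2] + d[-1] < 0
)
```

With six pairs there are 64 sign patterns; the first criterion is met by 2 of them (.031) and the second by 4 (.062).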


DISCUSSION 


The relatively small sample size employed in the present
experiment limits, to some degree, the extent to which
generalizations can be made. The major virtue of the
experiment is perhaps in the fact that problems usually
susceptible only to field investigation were brought into
the controlled laboratory environment.

Thus, some quantitative validity can be given to the
proposition that individual supervisor characteristics are
more influential on work-team performance than are the
various standard styles of carrying out the supervisory
function. The problem remains, however, as to just exactly
what are such personal characteristics. Although there are
supervisor selection techniques available, further vigorous
study of the problem should be given added incentive by the
present findings.

The comparison of the supervisory roles, per se, leads to
the proposition that there is

a distinct trade-off effect between performance 
criteria. Thus, the laissez-faire role with its 
nonintervention feature allows the controllers 
to concentrate on maintaining a rapid flow 
through the system. The controller apparently 
accomplishes this particular end with a cer- 
tain recklessness since his error scores are 
relatively high. When the supervisor expresses 
his role more aggressively, errors are reduced, 
but at a cost in the rapidity of flow. The 
management implication seems clear. Recom- 
mended supervisory procedure depends on the 
importance of speed vs. accuracy in the task 
at hand. Apparently, high levels of accuracy 
demand more overt intervention by the su- 
pervisor. When both are equally important, a 
moderate level of intervention would seem 
most appropriate. 

This conclusion should also be tempered by 
the recognition that novice controllers will re- 
quire more active participation on the part of 
their supervisors so that minimal performance 
levels are consistently assured. Thus, adop- 
tion of the laissez-faire approach would de- 
pend on both the nature of the task and the 
level of achieved skill on the part of the operators.




In the present study, the supervised con- 
trollers were informally interviewed subse- 
quent to their participation in the experiment. 
Their comments were fairly consistent: they 
preferred the laissez-faire condition. Their re- 
sponse to the other conditions changed with 
time. At first, they were somewhat resentful 
to the supervisor's "interference," but as time 
passed they were more inclined to perceive 
the supervisor as at least well intentioned. 
Therefore, an effective supervisor might be 
one who promptly made his intentions mani- 
fest to his team. 


SUMMARY 


A complex task setting provided by the simulation of a
radar air traffic control system was the context employed
to evaluate the extent to which work-team productivity is
modifiable as a consequence of different supervisors and
supervisory procedures under different task loads. Six
two-man teams were observed. Each team worked under each of
three supervisors. The supervisors shifted from team to
team. Three techniques were adopted alternately:
laissez-faire, active monitoring, and direct participation.

It was observed that the individual supervisor was a more
consistent influence on performance than the particular
role he employed. The effect of role per se was significant
but interpretable only in light of the particular
performance criterion used. For example, processing speed
was greatest under laissez-faire conditions, while error
avoidance was superior under the conditions which required
more overt supervisory activity. Task load was not found to
be an interactive factor.


REFERENCES

LEWIN, K., LIPPITT, R., & WHITE, R. K. Patterns of
aggressive behavior in experimentally created “social
climates.” J. soc. Psychol., 1939, 10, 271-299.
ROETHLISBERGER, F. J., & DICKSON, W. J. Management and the
worker. Cambridge: Harvard Univer. Press, 1939.
SHARTLE, C. L. Executive performance and leadership.
Englewood Cliffs, N. J.: Prentice-Hall, 1956.
SIEGEL, S. Nonparametric statistics for the behavioral
sciences. New York: McGraw-Hill, 1956.


(Received November 25, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 6, 393-401 


BALES’ INTERACTION PROCESS ANALYSIS OF
PERSONNEL SELECTION INTERVIEWS ¹


DANIEL SYDIAHA


University of Saskatchewan


The purpose of this investigation was to examine whether
Bales’ (1950) categories of interaction process analysis
were related to decisions made by personnel selection
interviewers.

It has been generally recognized that decisions made by
interviewers are notoriously unreliable, in the sense that
assessments made by two or more interviewers of the same
applicants tend to differ markedly. The question as to the
sources of this unreliability, however, has not been dealt
with in the literature, except for casual or speculative
mention of various situational factors such as unsystematic
questioning or interviewer bias. In the absence of
empirically supported generalizations about sources of
error, the practising interviewer is left with little more
than a feeling of disdain toward a technique which is
generally believed to be inadequate.

The hypothesis advanced in the work reported here is that
decisions made by interviewers in personnel selection are
functionally related to measurable dimensions of interview
process and content. The point of view adopted is
essentially Brunswik's notion of probabilistic
functionalism (Brunswik, 1956). While this is admittedly a
simple hypothesis, it is nonetheless not a naive one, since
the more common assumption about interviewing is that
interview process and content are unsystematic and chaotic.
The approach taken has been frankly empirical, implying no
strong predisposition as to the nature of interview
process, but rather an attempt is made at developing three
sets of variables which have some plausibility in being
functionally related to decisions. These variables are:
(a) information bias, reported in a previous paper
(Sydiaha, 1959); (b) Bales’ interaction process analysis,
reported here; and (c) interviewer empathy, to be reported
in a subsequent paper.


1 Financed by Defence Research Board of Canada, Grant No.
9435-53 to E. C. Webster, McGill University. Webster
directed the research and Allen Clark and John Kenyon
assisted with computations. The author gratefully
acknowledges the assistance of all interviewers who
assisted in providing information, and in particular,
members of the Canadian Army Personnel Selection Service.

In casting about for measures which might be expected to
show some uniformity in interview situations, Bales’ system
of interaction process analysis was selected as a promising
method because of Bales’ success in demonstrating the
existence of certain uniformities in group process (see,
for example, Bales, 1950; and articles in Hare, Borgatta, &
Bales, 1955; Maccoby, Newcomb, & Hartley, 1958; Swanson,
Newcomb, & Hartley, 1952). These uniformities in
interaction patterns have led to some useful theoretical
statements on the nature of group process. Other more
specific differentiations have been made of more direct
relevance to decision making in interviewing, such as the
differences in interaction profiles between (a) successful
and unsuccessful groups, as described by group
participants; and (b) group leaders who are task oriented
(the “idea-man”) and person oriented (the “social-emotional
specialist”).

The interaction categories are described elsewhere (Bales,
1950).² Briefly it may be noted that a distinction is made
between acts which are classified in “task” categories (4,
5, 6, 7, 8, and 9) and “social-emotional” categories (1, 2,
3, and 10, 11, 12). Within these two broad divisions, there
is a further division of task acts into questions
(Categories 7, 8, 9) and attempted answers (4, 5, 6) while
social-emotional acts divide into positive (1, 2, 3) and
negative (10, 11, 12) acts. In addition to these 12
categories proposed by Bales, two other categories were
added for this study, namely, Categories 6A and 7A, these
being subcategories of 6 and 7. Category 7A acts were those
usually scored in Category 7 (“asks for information,
orientation, etc.”) but which are framed in such a way as
to require a simple “yes” or “no” response. Similarly
Category 6A acts were those usually scored in Category 6
(“gives information, orientation, etc.”) but they are
expressed as simply “yes” or “no.” The purpose of adding
these categories was to test the consequences of what might
be termed “directive” questioning in which the applicant
has little or no control over the information communicated
but can only reply “yes” or “no” to questions put to him.
This pattern of questioning is in contrast to a more
nondirective situation in which questions are open ended,
the applicant is given considerable freedom of choice in
his responses and, in a sense, is able to exert some
control over the interview. It seemed apparent, from
informal observations made during preliminary stages of the
study, that directive questioning tended to have one of two
consequences: either an atmosphere was created of the
interviewer cross-examining the applicant, or it signified
that the interviewer had lost control of the situation in
that he had exhausted his supply of questions and was
frantically searching for things to talk about. In either
case, it seemed apparent that the situation was an
uncomfortable one for both applicant and interviewer, and
tended to be associated with rejection of the application.


2 The 12 Bales categories are briefly designated as
follows: 1. shows solidarity, 2. shows tension release,
3. agrees, 4. gives suggestions, 5. gives opinion, 6. gives
information, 7. asks for information, 8. asks for opinion,
9. asks for suggestion, 10. disagrees, 11. shows tension,
12. shows antagonism. Two additional categories used in
this study were: 7A. asks for information so as to require
“yes” or “no” responses, 6A. “yes” or “no” responses to 7A
questions.
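
The grouping of categories described above can be written down as a small lookup table. This is a minimal sketch for illustration; the names `BALES_GROUPS`, `SUBCATEGORY_PARENT`, and `group_of` are hypothetical, and the group labels paraphrase the text rather than quote Bales.

```python
# Broad groupings of the 12 Bales categories, as described in the text.
BALES_GROUPS = {
    "positive social-emotional": [1, 2, 3],
    "attempted answers (task)": [4, 5, 6],
    "questions (task)": [7, 8, 9],
    "negative social-emotional": [10, 11, 12],
}

# The two subcategories added for this study and their parent categories.
SUBCATEGORY_PARENT = {"6A": 6, "7A": 7}


def group_of(category):
    """Return the broad grouping for a Bales category or a 6A/7A subcategory."""
    number = SUBCATEGORY_PARENT.get(category, category)
    for name, members in BALES_GROUPS.items():
        if number in members:
            return name
    raise KeyError(category)
```

For example, `group_of("7A")` falls under task questions, since 7A is scored as a special case of Category 7.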
The following hypotheses were proposed, consistent with the
above discussion:


H1: Comparing interviews in which applicants are accepted
with those in which applicants are rejected: (a)
interaction scores are higher for Categories 1, 2, 3, 4, 5,
6, and (b) lower for Categories 7, 8, 9, 10, 11, 12, 6A,
and 7A.

H2: (Consistent with the notion of interinterviewer error)
Differences in interaction scores between accept and reject
cases are greater for some interviewers than for others.




Two studies were executed, designated Stud- 
ies I and II. 


STUDY I
Procedure


Training in the use of Bales interaction process analysis.
The method of scoring described by Bales (1950) was adhered
to as closely as possible. Sound recordings of interviews
were obtained by the experimenter (the writer) and scoring
of these interviews was repeated until two consecutive
scorings of the same interviews gave comparable results.
Comparability was tested by means of the chi square test as
described in Bales (1950). To check the comparability of
results, another scorer (A. Clark) received some training
in the scoring method, marked discrepancies between the two
scorers were discussed, and criteria for agreement decided
upon in light of Bales’ instructions. As an added check on
scoring, the experimenter visited Bales at Harvard and
discussed scoring problems with him.³

Training was carried out with a sample of interviews. Upon
completion of training, a sample of 67 interviews was
scored twice to test the reproducibility of scoring. Time
intervals between consecutive scorings varied from 6 hours
to 3 weeks. For each scoring category, a correlation
(Pearson product-moment) was calculated between the two
sets of scores. The results are shown in Table 1.
Reproducibility of scores shown in Table 1 is high, but the
correlations are undoubtedly elevated by the recollection
of the first scoring on the second occasion.
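
The reproducibility check amounts to a product-moment correlation, per category, between the first and second scoring of the same interviews. The following is a minimal sketch of that calculation; the function name and the scores are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical percentage scores in one category for five interviews,
# scored on two occasions.
first_scoring = [12.0, 30.0, 8.0, 21.0, 17.0]
second_scoring = [14.0, 28.0, 9.0, 20.0, 19.0]
r = pearson_r(first_scoring, second_scoring)
```

A coefficient near 1 indicates that the second scoring reproduced the rank and spacing of the first; it does not by itself rule out memory of the first scoring, which is the caution raised in the text.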

Subjects. One hundred fourteen interviews were obtained
from nine interviewers, with 9-24 per interviewer. All
applicants and interviewers were male. The interviewers
included six Canadian Army personnel officers, one Royal
Canadian Air Force career counselor, one industrial
employment interviewer, and a psychologist engaged in
private consulting practice. Five of the Army officers
examined recruit applicants and the sixth examined


3 Bales’ assistance and interest in this study are
gratefully acknowledged.

—————— 


TABLE 1
CORRELATIONS (Pearson Product-Moment) BETWEEN TWO REPEATED
SCORINGS OF RECORDED INTERVIEWS
(N = 67)

Category r Category r Category r
1 89 6 97 $ za 
2. 340 GA 95 10 Ge 
3 98 ) 98 $: 90 
4 8o 7A 08 M 
5 96 8 87 i 
a Acts in Category 9 occurred very infrequently and were
not analyzed.


officer applicants. The Air Force interviewer examined
applicants for both air crew and administrative positions.
The industrial interviewer examined principally junior
office and clerical applicants. The subjects obtained from
the psychologist varied widely in terms of qualifications,
from industrial apprentices to senior executives, and for
purposes of this study, the sample was divided into
“Junior” and “Senior” groups. The number of interviews in
the 10 groups varied from 9 to 15.

The selection of the samples of interviewers and 
applicants is in no sense to be considered representa- 
tive of any population of selection interviews regu- 
larly conducted, but rather, merely a sample of ac- 
cessible interviews obtained from interviewers who 
could be persuaded to contribute data for the study. 

Conditions of interviewing. The interviews were “natural”
except that (a) in most cases (approximately 80%) the
experimenter was present in the interview room as an
observer, and (b) except for five interviews, all
interviews were sound recorded. Concealment of the sound
recorder was left to the discretion of the interviewer, and
in about 60% of the cases the applicant did not know that
there was a recorder in the room.

Upon completion of each interview, the interviewer recorded
his recommendation as to whether the applicant should be
accepted, rejected, or whether the case was marginal or
doubtful. Because the interviews were selected more or less
at random from the stream of applicants examined, it was
not possible to ensure that the numbers of cases would be
the same for all three categories or for all interviewers.
In some instances only a single case was obtained for a
particular decision category. Since no estimate of variance
could be made in this instance, it was decided to combine
the “marginal” and “reject” categories, thereby including
all cases obtained for the analysis. Admittedly, this
combining of judgment categories produced a questionable
division of cases since many of the marginal cases were in
fact accepted for employment although for purposes of
analysis they were classified as having been rejected. In
defence of this procedure, it is to be noted that the
criteria for employment were necessarily not uniform
because of differing employment settings in which the
interviewers worked. Furthermore, Study I was a preliminary
one only, and was designed to test whether differences
could be found between any decision categories. Study II
made use of consistent as well as working criteria of
acceptance-rejection.

Analysis of data. Interaction scores for each interview
were transformed into percentage figures in order to equate
for interview duration, and the percentage figures for each
category were subjected to a complex analysis of variance
with double classification. In other words, a separate
analysis of variance table was prepared for each of the 13
categories. (Category 9 was excluded from analysis because
there were too few acts in that category to warrant
analysis.) In each analysis of variance table, rows
represented decision categories, columns represented
interviewer groups, and the interactions represented
interviewer-decision category interactions. Since all
frequencies were unequal, it was necessary to make
adjustments by the method of expected equal frequencies, as
described in Ferguson (1959).


TABLE 2
MEAN SQUARE COMPONENTS OF ANALYSIS OF VARIANCE
FOR SCORING CATEGORIES
(Study I, N = 114)


R a
Category. (4/21) — (dro)
1 1150" — 2s
2 13:90 97.95** EH
3 45.98% — 5 8.98
4 2:60 log
ti 1124 21119
6 69.80 A145
6A 3.07 4.01
7 8.98 12:82
7A 10.68 4824
s 15.40 6.64
10 1.22* 18
m Em 217
12 13.76 120
i
23 1979.26% $2,70 91.67
lt 2+ 3 190,33" 469,26 1834 1942
4+ 546 101 139,71" 15.28%! 2
7+ 8+ 9 3329 488.10" 33.10
10 +11 -12 49,34 26.66 11.70%


Note. C = interviewer samples. R = decision categories.
* Statistically significant at 5% level.
** Statistically significant at 1% level.
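
The percentage transformation mentioned under Analysis of data can be illustrated as follows. This is a minimal sketch, assuming that each category's count is divided by the total number of acts scored in the interview; the function name and the counts are hypothetical.

```python
def to_percentages(counts):
    """Convert raw act counts per category into percentages of all acts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("interview has no scored acts")
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical act counts for one short interview (illustrative only).
counts = {"3": 10, "6": 50, "6A": 15, "7": 20, "7A": 5}
percentages = to_percentages(counts)
```

Expressing each category as a share of all acts puts a 10-minute and a 40-minute interview on the same footing, which is the stated purpose of the transformation.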

Other variables analyzed were duration of interview, and
each of the following combined scores: 1 + 2 + 3,
4 + 5 + 6, 7 + 8 + 9, 10 + 11 + 12. Mean squares for row,
column, interaction, and individual interview components
are shown in Table 2. F ratios were calculated for all main
effects and interactions. Following McNemar (1955), the
denominators used for the F ratio were: individual
interviews mean square to test for the significance of
interactions, and interactions mean square to test for the
significance of main effects.
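
The choice of denominators can be made concrete with a short calculation. The sketch below is illustrative only: the function name and the mean squares are hypothetical, and it shows just the arithmetic of the rule stated above, not the adjustment for unequal frequencies.

```python
def f_ratios(ms_rows, ms_columns, ms_interaction, ms_within):
    """F ratios with the denominators described in the text.

    Main effects (rows = decision categories, columns = interviewers) are
    tested against the interaction mean square; the interaction is tested
    against the mean square for individual interviews within cells.
    """
    return {
        "rows": ms_rows / ms_interaction,
        "columns": ms_columns / ms_interaction,
        "interaction": ms_interaction / ms_within,
    }


# Hypothetical mean squares for one scoring category.
ratios = f_ratios(ms_rows=40.0, ms_columns=24.0, ms_interaction=8.0, ms_within=4.0)
```

Testing main effects against the interaction term is the more conservative choice when the interaction is itself sizable, since a decision effect must then exceed the variation in that effect from interviewer to interviewer.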


Results 


Decision category differences. Significant differences
between decision categories were obtained for Categories 1
(shows support), 3 (agrees), 1 + 2 + 3 (positive social
emotional acts), and 10 (disagrees). Mean category scores,
not presented in this paper, were consistent with
expectation: “accept” interviews were characterized by high
scores in positive social emotional categories, and by low
scores in Category 10.

It should be noted that of these four categories, three
also showed significant differences among interviewers
(Categories 1, 3, and 1 + 2 + 3). For these categories,
then, mean scores are not comparable for all interviewers.

Interviewer-decision category interactions. Significant
interactions were obtained for Categories 4 (gives
direction), 8 (asks for opinion), 12 (shows antagonism),
4 + 5 + 6 (problem solving attempts), and 10 + 11 + 12
(negative social emotional acts). Mean category scores were
again consistent with expectation, although there were some
reversals. In general, “accept” interviews tended to be
characterized by high scores in Categories 4 and 4 + 5 + 6,
and by low scores in Categories 8, 12, and 10 + 11 + 12.

These results imply that decision making is related to
frequency of occurrence of acts in these categories, but
not for all interviewers. These categories suggest sources
of interview process which account for interinterviewer
error.

Of the five categories, two also showed significant
differences among interviewers (Categories 8 and
4 + 5 + 6). This implies noncomparability of scores among
interviewers, although these findings have no special
significance since, in any case, all significant
interactions imply noncomparability of scores.

Interviewer differences. Significant differences between
interviewers were obtained for Categories 2 (shows
satisfaction), 6A (gives information in “yes” or “no”
terms), 7 (asks for information), 7A (asks for information
with questions to be answered as “yes” or “no”), 11 (shows
tension), duration of interview, and 7 + 8 + 9 (questions).
These results are of no special significance, however,
since they are not related to the decision categories.


Discussion 


Hypotheses 1 and 2 were confirmed: some 
of the Bales scoring categories were correlated 
with decisions made by personnel selection 
interviewers. In general favorable decisions 
were characterized by high scores in problem 
solving attempts and positive social emotional
acts, and by low scores in questions and nega- 
tive social-emotional acts. However, the dif- 
ferences between decisions were not uniform 
among interviewers. 

In general, the results of this preliminary study showed
promise of depicting something about the nature of
interview process, with regard to uniformities both
specific and common to interviewers.
Demonstrating the presence of a relationship between
categories and decisions leaves unanswered the question of
the magnitude of the relationship. Because the interview
groups in Study I were too small to permit the calculation
of correlation coefficients between variables, Study II was
conducted with larger numbers of interviews per interviewer
to permit such estimates of degree of relationship.
Furthermore, all cases were obtained from one organization
(the Canadian Army) in order to standardize the criteria of
acceptance-rejection used.
Study II was also set up to examine the importance of
interviewer and applicant when considered separately. Up to
this point, each category score had included the combined
acts of both applicant and interviewer and the results
considered in terms of the relevance of interaction scores
to decision making, regardless of which participant
contributed acts to the conversation. Separate participant
scores, on the other hand, are related to questions of
individual role performance as it relates to decision
making in the interview.
Some evidence in support of distinctive applicant and
interviewer profiles was obtained from the data in Study I,
but it was not included in Table 2 because many of the
category scores were found to be heterogeneous, thus
rendering use of the F ratio somewhat tenuous. Analysis of
the data using Friedman's nonparametric analysis of
variance (see Ferguson, 1959) yielded differences between
decision categories for some categories, but since the
statistical procedure does not assess interaction effects,
the data were not considered comparable to those shown in
Table 2, and are omitted from this paper for that reason.


STUDY II
Procedure


Details of the study are described elsewhere (Sydiaha,
1959). Each of eight Regular Force personnel officers
interviewed from 13 to 50 Regular Force applicants for the
Army. Total N for the project was 256, but Bales data were
missing in 6 cases. Interviews were conducted under usual
Army circumstances except for additional research material
requested for the studies herein reported, and the
interviews were sound recorded. The recording machines were
in plain view of all applicants, and they were informed
that the recording was to be used for research purposes.
Only the applicant and the interviewer were present in the
room during each interview. Each case was assessed by only
one officer, who classified the applicant as either
“accepted” or “rejected.” Rejected cases included 20
applicants recommended for assessment at a later date and 7
applicants referred for psychiatric assessment. Accepted
cases included 29 applicants considered marginally suitable
for Army service.

The cases were divided into three separate groups
to permit cross-validation of findings. Cases provided
by Officers A, B, F, and G (N = 37, 50, 50, 41)
were randomly assigned to a criterion group (N
= 88) and a holdout group (N = 90). The remaining
cases, i.e., those provided by Officers C, D, E, and
H (N = 13, 23, 18, 18) made up a second holdout
group (N = 72).

Scoring was performed from disc sound recordings
by the experimenter (the writer) as in Study I. For
each interview, interaction scores were converted to
percentages for the applicant and interviewer scores
separately in order to partial out the effects of the
total amount spoken by each of the participants.
(These were treated as separate scores.) The general
statistical procedure followed was to combine all
scores mathematically in such a way as to maximize
the correlation between combined scores and
acceptance-rejection for the criterion group. The
scoring key was then applied to the holdout groups
to test the relation between scores and acceptance-rejection.
Three such combined scores (referred to
henceforth as "indices") were developed: the "Applicant
Index" (made up of applicant scores), the
"Interviewer Index" (made up of interviewer scores),
and the "Interview Duration Index" (made up of
the total number of acts scored for each of the
participants and the duration of the interview in
minutes).
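
The percentage conversion described above can be illustrated with a minimal sketch. The category names and act counts below are hypothetical and are not taken from the study; the function simply expresses each participant's acts per category as a percentage of that participant's own total, which is one way of removing the effect of the total amount spoken.

```python
def to_percentages(category_counts):
    """Express each category count as a percentage of the participant's total acts."""
    total = sum(category_counts.values())
    if total == 0:
        raise ValueError("participant has no scored acts")
    return {category: 100.0 * count / total
            for category, count in category_counts.items()}

# Hypothetical act counts for one interview (illustrative only).
applicant_counts = {"gives_information": 30, "agrees": 15, "asks_question": 5}
interviewer_counts = {"gives_information": 10, "agrees": 5, "asks_question": 25}

# Applicant and interviewer are converted separately, as in the text.
applicant_pct = to_percentages(applicant_counts)
interviewer_pct = to_percentages(interviewer_counts)
```

Because each participant is scaled by his own total, a talkative applicant and a quiet one become comparable category by category.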

The discriminant function was used to obtain
the Interview Duration Index. This procedure was
not felt to be justified for making up the other
indices, however, because of the large number of
variables involved. Guilford (1954), for example,
has pointed out the inadvisability of using regression
analysis for combining items when their
number is high. Consequently, the other indices were
made up by the standard error method
described by Gulliksen (1950), where items are
weighted as a function of the [illegible] and
the reliability coefficient. Variables which correlated
negatively with acceptance-rejection
[illegible] were assigned negative weights. After all three
indices had been calculated, they were combined into
a single index by means of the discriminant function.

As was indicated in the results of Study I above,
[illegible]. To assess the consequences of such
interviewer differences in Study II, two separate criteria of item
selection were used, which, in turn, led to the development
of two sets of scoring keys for combining
scores, and two sets of indices. The two criteria were:
acceptance-rejection for individual Officer Samples A,
B, F, G; acceptance-rejection for the combined Officer
Sample A + B + F + G. The reasoning involved
in the use of these two criteria was that, on the one
hand, items correlated with acceptance-rejection for
individual officers would reflect criterion variance
specific to each officer, whereas items correlated with
acceptance-rejection for the combined officer sample
would reflect criterion variance common to all officers.
Depending upon the relative proportion of specific-
and common-criterion variance present in the
data, the following results might be expected to occur:
(a) If criterion variance common to all officers
were zero, then correlations between indices and
acceptance-rejection would be zero for the combined
holdout samples (A + B + F + G) and (C + D + E + H).
In other words, the magnitude of such combined
sample correlations would be indicative of criterion
variance common to all officers. (b) If criterion
variance specific to officers were zero, then correlations
between indices and acceptance-rejection
would be the same, regardless of whether indices
were based upon individually based or group based
scoring keys. In other words, correlation differences
between indices derived from these two scoring keys
would be indicative of criterion variance associated
with specific officers.

Consistent with such reasoning, the procedures
adopted were as follows:

Individual scoring keys: All variables were cor- 
related with acceptance-rejection for each of the four 
officer criterion groups, namely, A, B, F, G. The 
scoring weights derived were applied to each of the 
corresponding officer holdout groups. This procedure 
was intended to maximize criterion variance specific
to the four officers. 

Group scoring keys: All variables were converted 
to standard score form, based upon the distribution 
of scores obtained for each officer. All variables were 
then correlated with acceptance-rejection for the 
combined criterion group, namely, A + B + F + G.
The scoring weights derived were applied to the two
holdout groups, namely, A + B + F + G and C + D
+ E + H. This procedure was intended to minimize
criterion variance specific to the four officers.
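
As a rough illustration of the group scoring key procedure, the sketch below standardizes a variable within each officer's sample and then computes a point-biserial correlation with acceptance-rejection over the pooled cases. All data and names are hypothetical. The point-biserial coefficient is computed here as an ordinary Pearson correlation with a 0/1 criterion, which is numerically equivalent; this is offered as an illustration under those assumptions and not as the computing routine actually used in the study.

```python
import math
from statistics import mean, pstdev

def standardize(scores):
    """Convert raw scores to standard (z) scores within one officer's sample."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]

def point_biserial(x, y):
    """Pearson r between a continuous variable x and a 0/1 criterion y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical percentage scores on one category, by officer;
# decisions are coded 1 = accepted, 0 = rejected.
samples = {
    "A": ([10.0, 20.0, 30.0, 40.0], [0, 0, 1, 1]),
    "B": ([50.0, 55.0, 60.0, 65.0], [0, 1, 0, 1]),
}

pooled_z, pooled_decisions = [], []
for officer, (scores, decisions) in samples.items():
    # Standardizing within officer removes officer differences in level and spread.
    pooled_z.extend(standardize(scores))
    pooled_decisions.extend(decisions)

r_group = point_biserial(pooled_z, pooled_decisions)
```

Standardizing before pooling is what keeps an officer whose applicants score uniformly high from dominating the pooled correlation.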


Results 


Group scoring keys. 

Applicant indices: Correlations between ap- 
plicant indices and acceptance-rejection were 
.31 and .23 for the two holdout groups (see 
Column 4, Table 3). These correlations are 
statistically different from zero (p < .005). 
(Point-biserial correlations were tested for 
statistical significance by the method de- 
scribed in Ferguson, 1959.) The difference 
between these correlations was not significant, 
thus making it possible to calculate a correla- 




TABLE 3

CORRELATION (POINT-BISERIAL) BETWEEN ACCEPTANCE-REJECTION AND THREE BALES' INDICES,
USING INDIVIDUAL SCORING KEYS AND GROUP SCORING KEYS

(Study II)

1       2   3    4    5    6    7    8    9    10
            Applicant Interviewer Interview Duration Three indices
            Index     Index       Index              combined a
Sample  N   ISK  GSK  ISK  GSK  ISK  GSK  ISK  GSK
\ 19 22 31 .39 A2 —i05 —.07 AS EI 
B 25 A0 32 -10 .55 28 .22 .68 .50 
[e 13 —b 
D 23 .20 A3 32 A2 
E 18 24 24 -18 39 
I 25 35 12 46 i32 —.33 A4 38 ES 
G 21 27 46 AT 15 —.05 .00 24 28 
H 18 A9 52 35 .68 
A+B+F+G 90 .32* .31 .62* .34 -.04* .08 .46* .37
C+D+E+H 72 .23 .25 .19 .32
All cases 162 28 .29 42 


a Indices were combined via discriminant function.
b Officer C had no rejected applicants in his sample.
* Mean correlations.


tion based on all cases, which was .28. Differences
among the seven officer-sample correlations
shown in Column 4 of Table 3 were
not significant (p > .9). (Correlation differences
among officer samples in this paper
were tested by the method described in
Snedecor, 1956.)


Interviewer indices: Correlations between 
interviewer indices and acceptance-rejection 
were .34 and .25 for the two holdout groups 
(see Column 6, Table 3). These correlations 
are statistically different from zero (p < .005 
and .025, respectively). The difference be- 
tween these correlations was nonsignificant, 
and the correlation coefficient for all cases 
combined was .29. Differences among the 
seven officer-sample correlations shown in 
Column 6 of Table 3 were not significant 
(p > .8).

Interview duration indices: Correlations be- 
tween the interview duration indices and ac- 
ceptance-rejection were .08 and .19 for the 
two holdout groups (see Column 8, Table 3). 
These correlations are not statistically differ- 
ent from zero, as determined by the test of 
the significance of multiple correlation, de- 
scribed by Ferguson (1959). 


Three indices combined: Correlations between
the three indices combined and acceptance-rejection
were .37 and .32 for the two
holdout groups (see Column 10, Table 3).
Using the test for the statistical significance
of multiple correlation, only the first correlation
is different from zero (p < .01). However,
the two correlations are not significantly
different, and the multiple correlation for all
cases combined is .35, which is nonzero (p
< .01). Differences among the seven officer-sample
correlations shown in Column 10 of
Table 3 were not significant (p > .9).

The results tend to support Expectation a.
With the exception of the interview duration
indices, all measures were correlated with acceptance-rejection.
The scoring procedures
used were successful in eliminating interinterviewer
differences in interaction scores, while
at the same time demonstrating consistent
interaction process common to all interviewers.

Individual scoring keys.

Applicant indices: The average correlation
between applicant indices and acceptance-rejection
was .32 compared with .31 for indices
based on the group scoring keys (see Columns
3 and 4, Table 3).




Interviewer indices: The average correla- 
tion between interviewer indices and accept- 
ance-rejection was .62 compared with .34 for 
indices based on group scoring keys (see 
Columns 5 and 6, Table 3). The question as 
to whether the difference between these cor- 
relations is significant could not be answered 
directly and a simplifying assumption was 
made in order to obtain an estimate of sta- 
tistical significance. The reason that a direct 
test of significance could not be made in this 
instance was that a correlation between the 
two sets of indices was required in order to 
test the correlation differences since both cor- 
relations were based on the same subjects 
(see Ferguson, 1959). Such a correlation be- 
tween indices was not feasible in this case 
since there were wide differences in mean
scores among the four officer samples based 
on the individual scoring keys. 

The simplifying assumption made was that 
the correlation between the two indices was 
zero, which represents a conservative esti- 
mate of the true state of affairs. Using this 
assumption, the difference between the two 
correlations (.62 vs. .34) was statistically significant
(p < .05).
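
One common test for the difference between two correlations that share a criterion and are based on the same subjects is Hotelling's t, with N - 3 degrees of freedom. The sketch below applies it under the simplifying assumption of a zero correlation between the two indices. Whether this is the exact procedure followed in the study is an assumption, as is the use of N = 90 (the size of the combined A + B + F + G holdout group), since .62 is an average of four officer-sample correlations.

```python
import math

def hotelling_t(r_xy, r_zy, r_xz, n):
    """Hotelling's t for H0: rho_xy == rho_zy, where x and z are measured
    on the same n subjects and r_xz is their intercorrelation."""
    det = 1.0 - r_xy**2 - r_zy**2 - r_xz**2 + 2.0 * r_xy * r_zy * r_xz
    t = (r_xy - r_zy) * math.sqrt((n - 3) * (1.0 + r_xz) / (2.0 * det))
    return t, n - 3

# .62 = individual scoring keys, .34 = group scoring keys (interviewer indices).
# r_xz = 0.0 is the simplifying assumption described in the text; n = 90 is assumed here.
t_value, df = hotelling_t(0.62, 0.34, 0.0, 90)
```

Under these assumptions t is roughly 2.6 on 87 degrees of freedom, which exceeds the two-tailed .05 critical value of about 1.99 and is in line with the reported p < .05.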

Considering individual officer samples in
Columns 5 and 6 in Table 3, correlation differences
were not significant in any case, although
the small sample sizes involved
militate against demonstrating any significant
results.


Interview duration indices: The average
correlation between interview duration indices
and acceptance-rejection was -.04 compared
with .08 for indices based on group scoring
keys (see Columns 7 and 8, Table 3).

Three indices combined: The average correlation
between the three indices combined and
acceptance-rejection was .46 compared with
.37 for the indices based on the group
scoring keys (see Columns 9 and 10, Table 3).

The results obtained from the interviewer
indices tend to support Expectation b: Indices based on
individual keys account for more criterion
variance than do indices based on group keys.
Applicant and interview duration indices do
not yield different results when comparing individual
with group scoring keys.


DISCUSSION


Hypothesis 1 was confirmed: Bales’ inter- 
action process analysis scores were correlated 
with decisions made by personnel selection 
interviewers. The correlations obtained were 
modest in magnitude, although they were uni- 
form across all interviewers, provided the data 
were transformed into standard scores. Tak- 
ing the correlation coefficient of .35 based 
upon all cases and combining all categories it 
may be concluded that approximately 12% of
the criterion variance is accounted for by in- 
teraction score variance common to all inter- 
viewers. 

Confirmation of Hypothesis 2 was confined 
to interviewer conversation only: interviewer 
differences in interaction scores between ac- 
cept and reject cases were found only for in- 
terviewer indices, not for the applicant and 
interview duration indices. Taking .62 as being
the correlation between interviewer indices
and acceptance-rejection we may estimate the 
criterion variance thus accounted for to be 
38%, which presumably includes both com- 
mon and specific variance. By subtracting 
the estimated common variance (12%), men- 
tioned above, it would appear that criterion 
variance specific to interviewers is of the or- 
der of 26%. 
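
The variance estimates above follow from squaring the correlations. A minimal check of the arithmetic, using only the coefficients quoted in the text (the variable names are illustrative):

```python
r_common = 0.35   # all cases, three indices combined, group scoring keys
r_total = 0.62    # interviewer indices, individual scoring keys

common_var = r_common ** 2               # proportion of criterion variance, about .12
total_var = r_total ** 2                 # about .38
specific_var = total_var - common_var    # about .26, by subtraction

percentages = [round(100 * v) for v in (common_var, total_var, specific_var)]
```

The subtraction treats common and specific variance as additive, which is the same rough approximation the text itself makes.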

It should be understood that these variance 
estimates are, at best, very rough, since the 
correlations shown in Table 3 are based on 
very small samples, and there may undoubt- 
edly be differences among interviewers in the 
magnitude of criterion variance accounted for 
by specific interaction score variance. How- 
ever, as a first approximation, it may be con- 
cluded from this study that such specific in- 
teraction score variance accounts for inter- 
interviewer error to a significant degree. 

It should be pointed out that the occur- 
rence of such interviewer differences precludes 
the making of definitive statements as to 
"typical" interaction scores in employment 
interviews. While mean scores in Study II 
tended to be consistent with the results of 
Study I (“accept” interviews tended to be 
characterized by high scores in positive so- 
cial-emotional, and problem solving cate- 
gories, and by low scores in questions and 
negative social-emotional categories) these 


400 


trends were somewhat incidental. Mean scores 
varied considerably among decision categories 
as well as among interviewers, and reversals 
in the direction of mean differences were 
common. 

The question remains as to the implications 
of these findings for interview techniques. 
Clearly, this study has dealt with the assess- 
ment of the concomitants of decision making 
rather than with manipulable antecedents, 
such that any interpretation must be con- 
sidered speculative, pending confirmation by 
experimental manipulation of relevant circum- 
stances. At least two interpretations of the re- 
sults are apparent: 

1. Interaction process scores may reflect
valid dimensions of job performance such as 
interpersonal skills. Interinterviewer differ- 
ences in interaction process scores would cor- 
respond to differences in sensitivity to such 
skills. (A follow-up is now under way in which 
interaction process scores will be correlated 
with criterion measures based on the in- 
ductees’ performance records following 3 
years regular Army service.) 

2. Depending upon whether interaction 
process scores are correlated with docu- 
mentary and other descriptive information 
presented during the interview, these results 
would throw some light on the possible im- 
portance of nonrational processes in decision 
making. To illustrate this point, let it be as- 
sumed that interaction process scores and de- 
scriptive information are uncorrelated. Then 
the fact that interaction process scores are 
correlated with acceptance-rejection would 
imply that the interviewer is sensitive to in- 
fluences which are beyond the realm of data 
gathering, and may possibly overlook infor- 
mation in reaching his decision, by reacting 
to a “favourable atmosphere.” Obviously, if 
the existence of such an atmosphere is irrele- 
vant to his task as formerly defined, then the 
interviewer may be subject to error in his 
decisions. Particularly if it could be shown 
that applicant subjects could establish a fa- 
vorable atmosphere by “role playing” cer- 
tain interaction categories, these results would 
serve to demonstrate some of the pitfalls of 
interviewing technique. 

On the other hand, if interaction process 
scores were found to be correlated with de-
scriptive information, then the results ob-
tained in this study would be of little conse- 
quence, since this would imply that interac- 
tion scores merely reflected the consequences 
of favorable or unfavorable information 
(whichever the case might be) divulged dur- 
ing the interview. 

The data presented here cannot settle the 
issue as to which interpretation is more plau- 
sible. One appropriate test of the alternatives 
would involve having experimental applicant 
subjects attempt to "role play" or "fake"
certain interaction categories in order to bring 
about favorable decisions on the part of in- 
terviewers. It may be worth emphasizing that 
the data discussed here are entirely consistent 
with the notion that an applicant can fake his 
way to a favorable decision. (The data are 
also consistent with the notion that interviewers
respond to such faking in terms of different
interaction profiles.) But such interpretations
must wait on further investigation.


SUMMARY

Samples of interview conversation (personnel
selection interviews) were analyzed according
to Bales' interaction process analysis.
Scores obtained were correlated with decisions
made by interviewers about whether applicants
were recommended for acceptance or
rejection.

In Study I, data were obtained from nine
interviewers (10 groups of cases ranging from
9 to 15) and scores were submitted to analysis
of variance procedure. Some significant differences
in interaction process scores were found
between decision categories, and some interviewer-decision
interactions were also statistically
significant.

In Study II, data were obtained from eight
Canadian Army personnel officers who interviewed
from 13 to 50 applicants for the Army.
Correlations between interaction categories
and acceptance-rejection were calculated using
item analysis and cross-validation procedures.

The results indicated that: (a) Bales' interaction
process categories are correlated with
acceptance-rejection, the correlation being low
but consistent for transformed data; (b) interinterviewer
differences in interaction process
are confined to interviewer conversation
only.




The implications of the results are dis- 
cussed, in terms of the possible significance 
of applicant faking of favorable interactions. 


REFERENCES 


BALES, R. F. Interaction process analysis: A method
for the study of small groups. Cambridge: Addison-Wesley,
1950.

BRUNSWICK, E. Perception and representative design
of psychological experiments. (2nd ed.) Berkeley:
Univer. California Press, 1956.

FERGUSON, G. A. Statistical analysis in psychology
and education. New York: McGraw-Hill, 1959.

GUILFORD, J. P. Psychometric methods. (2nd ed.)
New York: McGraw-Hill, 1954.

GULLIKSEN, H. Theory of mental tests. New York:
Wiley, 1950.

HARE, A. P., BORGATTA, E. F., & BALES, R. F. (Eds.)
Small groups: Studies in social interaction. New
York: Knopf, 1955.

MACCOBY, ELEANOR E., NEWCOMB, T. M., & HARTLEY,
E. L. (Eds.) Readings in social psychology.
(3rd ed.) New York: Holt, 1958.

MCNEMAR, Q. Psychological statistics. (2nd ed.)
New York: Wiley, 1955.

SNEDECOR, G. W. Statistical methods. (5th ed.) Ames,
Ia.: Iowa State Coll. Press, 1956.

SWANSON, G. E., NEWCOMB, T. M., & HARTLEY, E. L.
(Eds.) Readings in social psychology. (2nd ed.)
New York: Holt, 1952.

SYDIAHA, D. On the equivalence of clinical and statistical
methods. J. appl. Psychol., 1959, 43, 395-401.


(Received December 9, 1960) 


Journal of Applied Psychology
1961, Vol. 45, No. 6, 402-407 


PROTECTION AGAINST IMPULSE-TYPE INDUSTRIAL
NOISE BY UTILIZING THE ACOUSTIC REFLEX 1


JAMES A. CHISMAN AND J. RICHARD SIMON


State University of Iowa 


In recent years industrial organizations have 
shown a growing interest in noise control. This 
increased concern with the problem of noise 
is due, in part, to recent court decisions and 
legislation favorable to hearing loss compen- 
sation for workers. Though considerable prog- 
ress has been made in controlling continuous 
noises, little has been accomplished in pro- 
tecting workers from loud impulse-type noises 
such as those produced by drop hammers, 
punch presses, dynamite, etc. The distinguish- 
ing characteristic of impulse noise is its rapid 
rise time, i.e., only a few milliseconds elapse 
between the onset and the peak intensity of 
the noise. Impulse noise is particularly haz- 
ardous because its rise time is more rapid 
than the action of the intra-aural muscles 
which normally contract to protect the ear 
against loud sounds (Metz, 1951).

The purpose of the present experiment was 
to investigate the practicability of protecting 
the ear against industrial impulse noise by ex- 
ternally eliciting the acoustic reflex (AR) ac- 
tion of the intra-aural muscles prior to the 
presentation of the noise. The stimulus used 
to activate the acoustic reflex was a pure tone, 
hereafter referred to as an AR tone. 

It has been known for some time that cer- 
tain tones above approximately 70 db. sound 
pressure level (SPL) elicit a consensual reflex 
contraction of the two intra-aural (middle 
ear) muscles, the tensor tympani, and the 
stapedius. This response, known as the AR, 
reduces the transmission of sounds through 
the middle ear and acts to minimize possible 
cochlear damage from overstimulation by loud 
sounds (Wiggers, 1937). 

Until recently, most investigators who have
studied the intra-aural muscles have utilized
animal subjects (Wever & Bray, 1937, 1942;
Wever & Vernon, 1955; Wiggers, 1937).
However, in recent years, research on human
subjects has been stimulated by the development
of new techniques such as the acoustic
bridge, and by the possibility of military applications.
Metz (1951) used the acoustic
bridge to study the contraction pattern of
the intra-aural muscles. He found that when
a 1,000-cycle tone of 100 db. was presented
there was a delay of about 35 milliseconds
before the intra-aural muscles began to contract
and that there was an additional 10
millisecond period before the muscles reached
full contraction. The time for each of these
phenomena was dependent on the intensity of
the activating tone, i.e., the more intense the
sound, the shorter the delay and the faster
the contraction. It was obvious from these results
that the AR afforded very little protection
against noises such as gun blasts which
have rapid rise times.


1 This article is based on an MS thesis done by
Chisman under the direction of Simon. The original
thesis is on deposit at the State University of Iowa
Library. The authors are indebted to Scott N. Reger
for his many helpful suggestions.

Fletcher and Riopelle (1959) and [illegible]
(1959) investigated the possibility of protecting
the ear from the noise of machine gunfire
by externally eliciting the AR. Subjects were exposed
to 100 rounds of machine gunfire, [illegible]
at a time, during a 7-minute period. Subjects
were tested under three conditions: first, without
an AR tone; second, with an AR tone;
and third, wearing V-51R earplugs. The average
SPL was 120 db., and the peak
SPL was 132 db. For the AR condition, a
1,000-cps tone of 98 db. was presented 200
milliseconds before each shot. A pure tone
threshold sensitivity test was performed on
each subject prior to the firing and again
within 15 seconds after the firing. The differences
in db. between the "before" and "after"
firing threshold at each frequency were computed.
This measure is known as the
temporary threshold shift (TTS). Although the
TTS differs in several respects from permanent
threshold shift, it is accepted by many as an
indication of the permanent hearing loss which
would be sustained if the subject were subjected
to the same noise over a long period of
time (Glorig, 1959; Jerger & Carhart, 1956).
Fletcher's results indicated that protection
by the AR tone was superior to that of the
earplugs up to and including 1,000 cps but
markedly inferior at and above 2,000 cps.
The overall mean TTS was 19.23 db. with no
protection, 6.27 db. with the AR tone, and
2.50 db. with earplugs. In other words, the
AR tone protected the auditory system against
machine gunfire, but did not provide as much
overall protection as did the earplugs.


METHOD 


The present experiment was designed to determine 
whether Fletcher’s notion of externally eliciting the 
AR action was applicable for protecting the ear 
against industrial impact noise. This study, however, 
differs from Fletcher’s in several important respects: 

1. Fletcher used an AR tone which had a very
rapid rise time. Therefore, the AR tone itself may
have produced a TTS. In the present study, the intensity
of the AR activator tone was increased gradually
over a period of 250 milliseconds. This gradual
build-up was designed to allow the intra-aural muscles
time to contract, thereby preventing the AR tone
from producing a TTS.

2. Fletcher investigated only one frequency of AR
tone. To determine the relative effect of frequency
on the AR action, two different frequencies, 250 cps
and 1,000 cps, were investigated in the present study.
Previous research (Simmons, 1959) had shown that
frequencies below about 2,000 cps cause the greatest
reduction in transmission through the middle ear.

3. The protection afforded by earplugs was not investigated
in this study since extensive research (Harris,
1957, Ch. 8) has already been conducted on the
use of earplugs for most types of industrial noise.


Apparatus 


A Williams mechanical drop hammer was used to
produce the impulse noise (average rise time, 1 millisecond)
for this experiment. The hammer produced
impacts in pairs, the two impacts in a pair being
separated by approximately 370 milliseconds. Hereafter,
the pair of impacts will be referred to as one
impact. An Ampex 601 tape recorder was used to record
100 impacts of the hammer. To pick up the
noise, a Western Electric condenser microphone was
placed at the normal position for the operator. The
interval between impacts was varied randomly from
[illegible] to 6 seconds.

To reproduce the noise, the tape recorder output
was fed through a 50-watt MacIntosh hi-fidelity
amplifier to a 15-inch Electro-Voice extended range
speaker mounted in a Karlson enclosure.2 The impact
noise was played back at 120 db. average SPL.
This level was chosen so that the impact would be
sufficient to produce the desired TTS and yet be below
the level at which there was a danger of permanent
hearing loss. The peak SPL was 139 db., and the
background noise was 69 db.3

The AR activator tone was set at 100 db. since 
previous research (Metz, 1951; Wever & Vernon, 
1955) suggested this to be the optimum level for AR 
action. To reduce the possibility of a TTS being pro- 
duced by the AR tone itself, the intensity of the AR 
tone was increased from 70 to 100 db. over a period 
of 250 milliseconds.4 The AR tone was produced
by an ADC 53C audiometer whose output was fed
through a 22-watt Bell amplifier into a 12-inch GE 
speaker mounted in a bass reflex enclosure. The 
circuitry of the audiometer was revamped to ob- 
tain the desired rise time for the AR tone. The AR 
tone was presented approximately 400-600 millisec- 
onds before each impact. This interval between tone 
and impact insured that the intra-aural muscles 
would be fully contracted at the time of the impact 
(Metz, 1951). The AR tone remained on during each 
impact. 
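
The gradual onset of the AR tone can be illustrated with a short sketch. It assumes that the level rises linearly in decibels from 70 to 100 db. over 250 milliseconds, a shape the text does not specify; the sampling rate, the hold duration, and all function names are hypothetical. Amplitude is expressed relative to the final 100-db. level.

```python
import math

def relative_amplitude(t, ramp_s=0.250, start_db=70.0, end_db=100.0):
    """Amplitude at time t (seconds) relative to the final level,
    assuming a linear-in-db. rise over ramp_s (an assumption)."""
    fraction = min(max(t / ramp_s, 0.0), 1.0)
    level_db = start_db + (end_db - start_db) * fraction
    # A change of 20 db. corresponds to a tenfold change in amplitude.
    return 10.0 ** ((level_db - end_db) / 20.0)

def ar_tone(freq_hz=250.0, hold_s=0.400, ramp_s=0.250, rate=8000):
    """Samples of a pure tone with the ramped onset followed by a steady hold."""
    n = round((ramp_s + hold_s) * rate)
    return [relative_amplitude(i / rate, ramp_s)
            * math.sin(2.0 * math.pi * freq_hz * i / rate)
            for i in range(n)]

samples = ar_tone()  # 250-cps tone: 250-ms ramp plus a 400-ms hold
```

A 30-db. rise means the tone starts at about 3% of its final amplitude, which is the sense in which the build-up is gradual.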


Subjects 


Ten college men at the State University of Iowa 
served as subjects. The criteria for choosing subjects 
were: no previous ear trouble or injuries, and 
no hearing loss greater than 15 db. between 250 and 
4,000 cps. Seven of the 10 subjects had no losses 
greater than 10 db. between 4,000 and 12,000 cps. 
Subjects ranged in age from 22 to 31 years with a 
mean age of 25 years. 


Procedure 


The subjects served in three experimental sessions: 
one in which they were exposed to the recorded im- 
pact noise, one in which they were exposed to the 
recorded impact noise plus the 250-cps AR tone, and 
one in which they were exposed to the recorded 


2 The difficulties of recording and reproducing high
intensity transient wave forms are well known. The
evidence suggests, however, that the equipment used
in this experiment was adequate for the purpose. A
Sonogram spectrum analyzer and a Tektronix oscilloscope
were used to compare the noise produced by
the hammer with the output of the speaker. The
spectra produced by the two sources were comparable,
and the transient responses also compared
favorably.

3 Peak SPL was measured with a Tektronix
oscilloscope in conjunction with a Western Electric
condenser microphone. Average SPL and background
noise were measured with a General Radio sound-level
meter.

4 A preliminary check conducted on five subjects
indicated that the AR tone with the 250-millisecond
rise time did not produce any TTS.


[Figure 1: mean temporary threshold shift in db. (vertical axis) plotted against frequency in cps (horizontal axis), with the overall mean.]

FIG. 1. Mean temporary threshold shift produced
by the three experimental treatments.


impact noise plus the 1,000-cps AR tone. The order 
of administration of treatments was randomized. 

At the start of each experimental session, an
audiogram for the subject’s right ear was obtained. 
An ADC 53C audiometer calibrated in 5-db. steps was 
used to determine thresholds for nine frequencies 
between 250 and 12,000 cps. The subject was then 
seated with the speaker reproducing the impact noise 
1 foot from his right ear and the speaker repro- 
ducing the AR tone 1 foot from his left ear. The 
subject was then exposed to one of the ex- 
perimental treatments in which 100 impacts were 
presented during a 10-minute period. Finally, a post- 
exposure audiogram was obtained for the subject’s 
right ear. The audiometer test was conducted immediately
following the 10-minute exposure to the
noise. The order of measuring the threshold at each 
frequency was randomized in order to control the 
possibility that the TTS might wear off during the 
audiometer test.5 The TTS was determined by com- 
paring the pre-exposure audiogram with the post- 
exposure audiogram. 
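
The TTS computation described above amounts to a per-frequency subtraction of the pre-exposure threshold from the postexposure threshold. The sketch below uses hypothetical audiogram values in 5-db. steps; the variable names are illustrative and the numbers are not data from the experiment.

```python
from statistics import mean

def temporary_threshold_shift(pre_db, post_db):
    """TTS per frequency: postexposure threshold minus pre-exposure threshold (db.)."""
    return {freq: post_db[freq] - pre_db[freq] for freq in pre_db}

# Hypothetical thresholds (db.) for one subject at three test frequencies (cps).
pre = {500: 5, 2000: 0, 8000: 10}
post = {500: 5, 2000: 10, 8000: 15}

tts = temporary_threshold_shift(pre, post)
mean_tts = mean(tts.values())  # average shift across the frequencies tested
```

A positive value indicates that the threshold rose after exposure, i.e., that hearing sensitivity was temporarily reduced at that frequency.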

The experiment was conducted in a soundproof 
room. At least 24 hours elapsed between experimental 
sessions. A comparison of pre-exposure audiograms 
indicated that this period was sufficient to allow 
complete recovery from exposure to the noise. 


RESULTS 


Figure 1 shows the mean TTS produced by 
the three experimental treatments at each 
frequency measured. Note that the curves for 
the two AR treatments are below the curve 
for the No-AR treatment at all frequencies 
between 500 and 12,000 cps. The largest dif- 
ferences between the AR treatments and the 
No-AR treatment are at 2,000 and at 8,000 
cps. The average TTS is 5.11 db. for the 


5 Subsequent analysis indicated that order of determining
postexposure threshold had no significant
effect.




No-AR treatment, 1.44 db. for the 1,000-cps 
AR tone, and 1.11 db. for the 250-cps tone. 
Figure 1 also suggests that the impact noise 
used in this experiment produced a TTS only 
at frequencies above 250 cps. 

Table 1 summarizes the Treatment x Frequency
x Subject (T x F x S) analysis of
variance used to analyze the data. Since the
impact noise did not produce a TTS at 250
cps, the data for this frequency were not included
in the analysis.6 The significant interaction
between Treatments and Frequency
(T x F) indicates that the effect of the treatments
on the TTS is not independent of the
frequency at which the TTS is produced. The
main effects of Treatments and Frequencies
are also significant, which means that the experimental
treatments produced different overall
effects, and exposure to the impulse noise
produced a different average TTS at the different
frequencies.
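
The F ratios in Table 1 can be checked from the sums of squares and degrees of freedom. The sketch below assumes that each effect is tested against its interaction with Subjects (T against T x S, F against F x S, and T x F against T x F x S); that pairing is inferred from the reported ratios rather than stated in the text. Because the printed mean squares are rounded, the recomputed ratios differ slightly from the tabled values.

```python
# Sums of squares and degrees of freedom as read from Table 1.
table = {
    "T": (997.5, 2),
    "F": (292.9, 7),
    "TxF": (320.8, 14),
    "FxS": (392.5, 63),
    "TxS": (187.9, 18),
    "TxFxS": (493.8, 126),
}

# Assumed error term for each effect (inferred, not stated in the text).
error_term = {"T": "TxS", "F": "FxS", "TxF": "TxFxS"}

def mean_square(source):
    """Mean square = sum of squares divided by degrees of freedom."""
    ss, df = table[source]
    return ss / df

f_ratios = {effect: mean_square(effect) / mean_square(error)
            for effect, error in error_term.items()}
```

The recomputed ratios come to roughly 47.8, 6.7, and 5.8, each on the degrees of freedom of the effect and of its assumed error term.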

Inspection of Figure 1 suggests that the
curves representing the two AR treatments
follow a similar trend. A test of the T x F
interaction for the two AR treatments

6 The audiometer test was purposely carried out
beyond the point at which a TTS might reasonably
be expected to occur. This was done in order to
delimit more adequately the range within which a
TTS occurred. None of the treatments produced a
TTS at 250 cps. On a priori grounds, therefore, the
data at 250 cps were not included in the analysis of
the difference between the experimental treatments.
A decision to discard data based on other
than a priori grounds would obviously modify the
probability of the Type I error.

TABLE 1

SUMMARY OF ANALYSIS OF VARIANCE FOR THREE TREATMENTS

Source             SS       df    MS      Error Term   F
Treatments (T)     997.5    2     498.8   T x S        48.0*
Frequencies (F)    292.9    7     41.8    F x S        6.7*
Subjects (S)       480.8    9     53.4
T x F              320.8    14    22.9    T x F x S    5.9*
F x S              392.5    63    6.2
T x S              187.9    18    10.4
T x F x S          493.8    126   3.9
Total              3166.2   239

* Significant at the .01 level.




(see Table 2) indicates that the trends of the 
two AR treatments are, in fact, parallel. A 
test of the Treatment main effect for the two 
AR treatments is also nonsignificant, indicat- 
ing that the trends are not significantly dis- 
placed from one another. It is therefore con- 
cluded that the two AR treatments had the 
same overall effect on the subjects used in 
this experiment. It also follows from these re- 
sults that the trend of the No-AR treatment 
is not parallel to the trend of either AR treat- 
ment. 

Since the two AR treatments did not differ, 
their TTS for each frequency were averaged 
and compared to the TTS of the No-AR treat- 
ment at each frequency. Table 3 summarizes 
the results of these analyses of variance. At 
five of the frequencies, when the AR tone was 
present, the TTS was significantly less than 
that observed for the No-AR treatment. These 
frequencies are 2,000, 3,000, 4,000, 6,000, 
and 8,000 cps. There was no difference be- 
tween the AR and No-AR treatments at three 
of the frequencies: 500, 1,000, and 12,000
cps. However, the difference between the treat- 
ments at 500 cps was just short of being sig- 
nificant. 
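The pattern of significance just described can be reproduced from the mean squares in Table 3 and the T × S error term of Table 1. The following is a minimal sketch rather than the authors' computation. It assumes each comparison has 1 df in the numerator and 18 df in the error term, and the two critical values are taken from standard F tables; none of these figures appears in the text. The names `ms_by_frequency`, `classify`, and `significant` are hypothetical.

```python
# Mean squares for the AR versus No-AR comparison at each frequency (Table 3).
ms_by_frequency = {
    500: 45.0, 1000: 20.0, 2000: 227.8, 3000: 61.2,
    4000: 165.3, 6000: 112.8, 8000: 320.0, 12000: 2.8,
}

MS_ERROR = 10.4    # T x S mean square from Table 1
F_CRIT_05 = 4.41   # assumed tabled F(1, 18) at the .05 level
F_CRIT_01 = 8.29   # assumed tabled F(1, 18) at the .01 level


def classify(ms):
    """Return the F ratio (one decimal) and the level it reaches."""
    f = ms / MS_ERROR
    if f >= F_CRIT_01:
        level = ".01"
    elif f >= F_CRIT_05:
        level = ".05"
    else:
        level = "n.s."
    return round(f, 1), level


# Frequencies at which the AR tone significantly reduced the TTS.
significant = [cps for cps, ms in ms_by_frequency.items()
               if classify(ms)[1] != "n.s."]

for cps, ms in ms_by_frequency.items():
    print(cps, *classify(ms))
```

Under these assumptions the ratio at 500 cps (4.3) falls just below the .05 critical value, which agrees with the remark that this difference was just short of being significant.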

At the end of the experiment, the subjects
were asked which experimental treatment was
more pleasing. The unanimous response was
that the 250-cps AR tone was the most pleasing
and comfortable.


TABLE 2
SUMMARY OF ANALYSIS OF VARIANCE
FOR AR TREATMENTS ONLY

Source                SS     df     MS   Error Term      F
Treatments (T)ᵃ       5.7     1    5.7   T × S          0.5
Frequencies (F)ᵃ     64.4     7    9.2   F × S          1.5
Subjects (S)ᵃ       138.4     9   15.4
T × Fᵃ               39.3     7    5.6   T × F × S      1.4
F × Sᵇ              392.5    63    6.2
T × Sᵇ              187.9    18   10.4
T × F × Sᵇ          493.8   126    3.9

a SS and MS for these terms are based on the two AR treatments only.
b MS for these error terms are the same as those in Table 1.




TABLE 3 
SUMMARY OF RESULTS OF ANALYSES OF VARIANCE


Frequency MS F 
500 cps 45.0 4.3 
1,000 cps 20.0 1.9 
2,000 cps 227.8 21.9** 
3,000 cps 61.2 5.9* 
4,000 cps 165.3 15.9** 
6,000 cps 112.8 10.8** 
8,000 cps 320.0 30.8** 
12,000 cps 2.8 0.3 


Note.—The error term used was the T × S interaction shown
in Table 1.

* Significant at the .05 level. 

** Significant at the .01 level. 


DISCUSSION 


Results of this study indicate that extern- 
ally eliciting the acoustic reflex is an effective 
means of protecting the auditory system 
against an industrial impact noise. Over a 
wide range of frequencies, the temporary 
threshold shift was significantly less when an 
AR tone preceded each impact than when the 
impact alone was presented. This protection 
resulted despite the fact that the total acoustic 
stimulus was greater in the AR treatments 
than in the No-AR treatment. 

No attempt was made in this study to com- 
pare the protection provided by an AR tone 
with the protection afforded by earplugs. The 
only study in which these two methods of 
protection have been compared directly indi- 
cates that earplugs afford superior protection 
at frequencies above 1,000 cps (Fletcher, 
1959). Earplugs, however, have several serious 
drawbacks. First, they make normal conversation
difficult. Second, they are often uncomfortable
and in some cases cause infection.
Third, and for the above two reasons, workers
often refuse to use earplugs. Finally, in order 
to be effective, earplugs must be fitted to the 
individual. It is often difficult to attain and 
retain the perfect seal which is essential. 

Protection by means of an AR tone has 
several obvious advantages over earplugs. 
First, AR protection operates only when 
necessary. Therefore, normal conversation 
may be carried on after the noise has termi- 
nated. Second, the AR tone, particularly the 




250-cps tone, does not produce discomfort or 
unpleasantness. Third, the AR tone can be 
presented automatically. Therefore, protection 
does not depend on the cooperation of the 
worker. Finally, the AR tone can serve the 
additional function of acting as a pre-impact 
warning device for workers in the area. 

The results of this limited investigation 
when viewed in the light of the many disad- 
vantages of earplugs suggest that activating 
the acoustic reflex may be a practical and ef- 
fective way of protecting industrial operators 
from impact noises. With most industrial ma- 
chines of the impact type, the operator is con- 
fined to a limited area. It would not be 
difficult to install a tone generator near the 
operator’s ear and, by means of an electronic 
timer integrated with the machine, to control 
the time relations between the AR tone and 
the impact. The timer could also be employed 
to cut off the AR tone an instant before each 
impact, thus reducing the amount of energy 
at the moment of impact. The operator would 
still be afforded maximum protection since the 
intra-aural muscles remain fully contracted 
for about 250 milliseconds after the AR tone 
has ceased (Metz, 1951). 
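The timing relations proposed here can be sketched as a small scheduling routine. This is an illustration under stated assumptions, not a description of any apparatus used in the study. The 500-millisecond lead is the midpoint of the 400 to 600 millisecond interval reported for the experiment, the 20-millisecond cutoff gap stands in for "an instant," and the evenly spaced impacts (100 in 10 minutes) are a simplification. All names (`ToneWindow`, `schedule_tone`, `REFLEX_HOLD_MS`) are hypothetical.

```python
from dataclasses import dataclass

# Contraction is reported to persist about this long after the tone ceases
# (Metz, 1951).
REFLEX_HOLD_MS = 250.0


@dataclass
class ToneWindow:
    tone_on_ms: float
    tone_off_ms: float
    impact_ms: float


def schedule_tone(impact_ms, lead_ms=500.0, cutoff_gap_ms=20.0):
    """Return when to switch the AR tone on and off around one impact.

    lead_ms and cutoff_gap_ms are illustrative defaults, not values
    prescribed by the study.
    """
    if not 0 < cutoff_gap_ms < REFLEX_HOLD_MS:
        raise ValueError("tone must end within the reflex hold period")
    if lead_ms <= cutoff_gap_ms:
        raise ValueError("tone must start before it is cut off")
    return ToneWindow(
        tone_on_ms=impact_ms - lead_ms,
        tone_off_ms=impact_ms - cutoff_gap_ms,
        impact_ms=impact_ms,
    )


# Assumed schedule: 100 impacts in 10 minutes, one every 6 seconds.
impacts = [6000.0 * (i + 1) for i in range(100)]
windows = [schedule_tone(t) for t in impacts]
print(windows[0])
```

The check on `cutoff_gap_ms` encodes the one constraint the argument depends on: the tone may stop before the impact only if the gap is shorter than the period during which the muscles remain contracted.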

One final point deserves some comment. The 
overall TTS produced in this study was sub- 
stantially less than that observed in Fletcher’s 
study although the average noise levels re- 
ported in the two studies were similar. There 
are several possible reasons for the differ- 
ing results. First, Fletcher points out that, 
because of the instruments he used, his esti- 
mates of SPL were conservative (Fletcher & 
Riopelle, 1959). The instruments used in the 
present experiment were more appropriate for 
measuring noise with a rapid rise time. Sec- 
ond, the 139-db. peak SPL reported in the 
present experiment is not comparable to the 
132-db. average peak SPL reported by 
Fletcher. In a later experiment which employed
Fletcher’s setup (Hilding, 1960), the
peak SPL was estimated to be 155-160 db. 
which is considerably higher than the 139-db. 
peak in the present study. Third, the noise 
spectra used in the two studies were consider- 
ably different. Fletcher’s noise spectrum was 


at the high end of the audio range, and the 
noise spectrum in the present study was at 
the low to middle end. Fourth, Fletcher pre- 




sented more impulses per unit of time, i.e., 100 
impulses in 7 minutes versus the 100 impulses 
in 10 minutes used in the present study. 


SUMMARY 


This experiment was designed to investi- 
gate the practicability of externally eliciting 
the acoustic reflex to protect the ear from in- 
dustrial impulse noise. Ten subjects were ex- 
posed to 100 impacts of the recorded noise of 
a mechanical drop hammer during a 10-minute 
period. The noise was presented at a 120-db. 
average SPL. Subjects listened to the same 
noise under three conditions. In Condition 1, 
the impacts were presented alone, i.e., with no 
AR tone presented. In Condition 2, a 250- 
cps, 100-db. tone was presented between 400 
and 600 milliseconds before each impact. 
Condition 3 was the same as Condition 2 except
that a 1,000-cps tone was used. Preexposure
audiograms were compared with
postexposure audiograms to determine the 
temporary threshold shift (TTS) under each 
condition. 

Analysis of the data indicated a significant
difference in TTS between the No-AR condition
and the two AR conditions at the following
frequencies: 2,000, 3,000, 4,000, 6,000,
and 8,000 cps. No differences were observed at
250, 500, 1,000, and 12,000 cps. In terms
of overall protection, the 250-cps AR tone did
not differ from the 1,000-cps AR tone although
subjects found the 250-cps tone to be
more pleasing.

Results are interpreted to suggest that
eliciting the acoustic reflex prior to an impact
may be an effective means of protecting the
ear against industrial impulse noises. Advantages
of AR protection over protection
provided by earplugs are discussed.


REFERENCES


Fletcher, J. L. Comparison of attenuation characteristics
of the acoustic reflex and the V-51R earplug.
USA Med. Res. Lab. Rep., 1959, No. 397.

Fletcher, J. L., & Riopelle, A. J. The protective
effect of acoustic reflex for impulsive noises. USA
Med. Res. Lab. Rep., 1959, No. 396.

Glorig, A. Current research in industrial noise. Noise
Control, 1959, 5(1), 32-35, 74.

Harris, C. M. (Ed.) Handbook of noise control.
New York: McGraw-Hill, 1957.

Hilding, D. A. The intratympanic muscle reflex as a
protective mechanism against loud impulsive noise.
Ann. Otol. Rhinol. Laryngol., 1960, 69, 51-60.

Jerger, J. F., & Carhart, R. Temporary threshold
shift as an index of noise susceptibility. J. Acoust.
Soc. Amer., 1956, 28, 611-613.

Metz, O. Studies on the contraction of the tympanic
muscles as indicated by changes in the impedance
of the ear. Acta otolaryngol., Stockh., 1951, 39,
397-405.

Simmons, F. B. Middle ear muscle activity at moderate
sound levels. Ann. Otol. Rhinol. Laryngol.,
1959, 68, 1126-1143.

Wever, E. G., & Bray, C. W. The tensor tympani
muscle and its relation to sound conduction. Ann.
Otol. Rhinol. Laryngol., 1937, 46, 947-961.

Wever, E. G., & Bray, C. W. The stapedius muscle
in relation to sound conduction. J. exp. Psychol.,
1942, 31, 35-43.

Wever, E. G., & Vernon, J. A. The effects of the
tympanic muscle reflexes upon sound transmission.
Acta otolaryngol., Stockh., 1955, 45, 433-439.

Wiggers, H. C. The functions of the intra-aural
muscles. Amer. J. Physiol., 1937, 120, 771-780.


(Received December 16, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 6, 408-414 


A FACTOR ANALYSIS OF THE CALIFORNIA 
PSYCHOLOGICAL INVENTORY 


JOHN O. CRITES, HAROLD P. BECHTOLDT, LEONARD D. GOODSTEIN, 
AND ALFRED B. HEILBRUN, Jr. 


University of Iowa 


Although developed over a period of years 
by Gough (1957) and based in part upon re- 
search with the Minnesota Multiphasic Per- 
sonality Inventory, the California Psychologi- 
cal Inventory (CPI) is a relatively unstudied 
newcomer to the area of personality measure- 
ment. A multidimensional inventory of nor- 
mal personality characteristics, the CPI con- 
sists of 18 separate scales grouped into four 
areas of “adjustment and development.” The 
first area, which describes aspects of social 
functioning, includes measures of Dominance 
(Do), Capacity for Status (Cs), Sociability
(Sy), Social Presence (Sp), Self-Acceptance
(Sa), and Sense of Well-Being (Wb). The 
second area, which concerns emotional and so- 
cial development, is comprised of scales on 
Responsibility (Re), Socialization (So), Self- 
Control (Sc), Tolerance (To), Good Impres- 
sion (Gi), and Communality (Cm). The third 
area, which relates to actual and potential 
achievement in educational and occupational 
pursuits, contains measures of Achievement 
via Conformance (Ac), Achievement via In- 
dependence (Ai), and Intellectual Efficiency 
(Ie). The final area, which assesses intellec- 
tual and interest modes, has measures of Psy- 
chological-Mindedness (Py), Flexibility (Fx), 
and Femininity (Fe). The various CPI scales 
were grouped into these four areas in order to 
"emphasize some of the psychological and 
psychometric clustering which exist among 
them" (Gough, 1957, p. 7). The groupings 
were not based upon factor analysis, however,
but rather upon inspection of intercorrelations
among scales.


PROBLEM 


In this study factor analysis of the CPI
scale intercorrelations was used to accomplish
a number of purposes. The major intent
of the investigation was to test Gough's
subjective groupings of CPI scales against the
more objective clusters derived from a factor
analysis. An additional objective was to reduce
the 18 scales of the CPI to a more economical
set which would not only result in
more efficient testing but which would facilitate
pattern analysis in the interpretation of
score configurations on the CPI profile. A final
purpose was to evaluate the extent to which
the CPI assesses different facets of normal
personality functioning.


PROCEDURE 
Subjects 


A sample of 372 subjects, stratified along two dimensions,
sex and client status, was available for factor analysis.
With respect to the sex variable, males and females were
treated separately for two reasons: first, raw scores and
hence norms for the two groups were different (Gough, 1957);
and second, it was thought that the scale interrelationships
for the sexes might differ, since different role expectations
for men and women probably result in unique configurations of
personality characteristics. With regard to client status,
client and nonclient groups were selected for independent
analysis because a previous study indicated that they differ
in both kinds and degrees of adjustment problems (Goodstein,
Crites, Heilbrun, & Rempel, 1961). One group of clients,
which was designated as "personal" (P), included all
individuals who upon intake at a university counseling center
checked the following category on an application card:
"Would like to discuss my feelings about myself and some
personal problems which are bothering me." A second group of
clients, which was termed "vocational-educational" (VE),
consisted of persons who at intake endorsed either or both of
these statements: "Would like to discuss my abilities,
interests, and my occupational plans for the future," and
"Would like to discuss my courses, grades, classes, and study
techniques; my progress here at the University." Following
initial application for counseling, and preceding interviews
with a counselor, the CPI was routinely administered to all
clients. The nonclients, or controls (C), were members of a
sophomore-level psychology course, who were tested with the
CPI during a class period. The total sample was stratified,
then, according to the two sexes and three categories of
client-nonclient status to yield six groups of 62 subjects
each for the factor analysis.




Design 


Scores on the 18 CPI scales for all subjects in the 
various groups were first punched on cards for use 
with the IBM 650 computer. The next step in the 
analysis was to test the homogeneity of the variance- 
covariance matrices for the six samples (Anderson, 
1958). If these were relatively homogeneous, then the 
groups could be combined for the factor analysis 
proper. The findings indicated, however, that the as- 
sumption of homogeneity was untenable. To locate 
the sources of heterogeneity, the variance-covariance 
matrices were compared for the six samples stratified 
by sex and client-nonclient status. In these compari- 
sons, there were no differences between sexes which 
were significant at the .01 level, although one was 
significant in the C group at the .05 level. There 
were significant differences at the .01 level, however,
between the P, VE, and C groups in all possible combinations.
It was decided, therefore, to combine the male and female
data (average within-group correlations) for the
client-nonclient groups and to perform the factor analyses
only on the latter. The remaining steps in the procedure
involved: (a) tests of independence, using the determinants
of the population estimates of the variance-covariance and
correlational matrices (Anderson, 1958); (b) computation of
factors for the three client-nonclient groups, following
the complete centroid method (Thurstone, 1947); (c) tests
for number of significant factors based on Lawley's
maximum-likelihood procedure (Thomson, 1951); and
(d) analytical and graphical rotation of the factors to an
oblique simple structure (Thurstone, 1947).
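The degrees of freedom reported for the separate and combined groups (122 and 366) follow from the stratification of the sample, and the pooling of male and female data can be illustrated in a few lines. The sketch below is illustrative only. The function `pooled_within_correlation` takes deviations from each subgroup's own means before pooling; that this is exactly what was meant by "average within-group correlations" is an assumption. All names are hypothetical.

```python
SEXES = 2
CLIENT_GROUPS = ("P", "VE", "C")
N_PER_CELL = 62

# Six cells of 62 subjects each.
total_n = SEXES * len(CLIENT_GROUPS) * N_PER_CELL

# Pooling the two sexes within one client group loses one df per cell mean.
df_within_group = SEXES * (N_PER_CELL - 1)

# Pooling all six cells for the combined regression analysis.
df_combined = len(CLIENT_GROUPS) * df_within_group


def pooled_within_correlation(cells):
    """Correlate x and y after removing each cell's own means.

    cells is a list of (xs, ys) pairs, one pair per subgroup.
    """
    sxx = syy = sxy = 0.0
    for xs, ys in cells:
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        sxx += sum((x - mx) ** 2 for x in xs)
        syy += sum((y - my) ** 2 for y in ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5


print(total_n, df_within_group, df_combined)
```

Removing cell means before pooling keeps a difference in average score between men and women from inflating or masking the correlation between two scales.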

In the selection of reference tests for each of five
factors, two criteria were used. A set of CPI scales
which had the largest loadings on the factors was
first identified, and then selections from the set were
made, one for each factor, to minimize the number
of overlapping items in the various scales. A sixth
reference test was selected to represent a dimension
not defined by the five-factor scales. Thus, a scale
chosen to represent a factor or dimension meets two
requirements: (a) it has a high, although not necessarily
the highest, loading on a factor; and (b) it has
the fewest number of items in common with scales
which represent other factors. The latter criterion
was necessary because the empirical approach used
by Gough in the construction of most of the CPI
scales resulted in considerable item overlap, a variable
which becomes confounded with the statistical
interrelationships of the scales when they are correlated.
Unless this lack of experimental independence
is eliminated, it is difficult to analyze the common
variance among the scales.

To evaluate the adequacy of the CPI reference
scales which were finally selected, multiple correlations
between the selected set and the remaining
scales were computed. These multiple Rs indicated
the extent to which the selected scales reproduced
variance on the unselected scales and hence were
informative of different scale groups or clusters.
Furthermore, the R²s provided an estimate of the
amount of the total variance accounted for by the
reference scales.


RESULTS 


The tests of independence for each of the 
correlation matrices of the P, VE, and C 
groups, as presented in Table 1, were signifi- 
cant. All of the determinants for the groups 
exceeded the expected value of .267 (SE 
= .0402) at the .01 level. These findings, 
which indicated significant degrees of depend- 
ence in the matrices, supported factorial analy- 
ses of the CPI scale intercorrelations for the 
three client-nonclient groups. 
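The expected value and standard error quoted above can be approximated from a standard result for the determinant of a sample correlation matrix when the variables are independent and normally distributed: the determinant is then distributed as a product of independent beta variables. The sketch below applies that result with p = 18 scales and n = 122 degrees of freedom. That the authors obtained their figures in this way is an assumption, and the name `determinant_moments` is hypothetical.

```python
from math import prod, sqrt


def determinant_moments(p, n):
    """Mean and standard deviation of |R| under independence.

    p is the number of variables and n the degrees of freedom on which
    the correlations are based. Under independence |R| behaves as a
    product of Beta((n - i) / 2, i / 2) variables for i = 1, ..., p - 1.
    """
    # First moment: product of the beta means (n - i) / n.
    mean = prod((n - i) / n for i in range(1, p))
    # Second moment: product of the beta second moments.
    second = prod((n - i) * (n - i + 2) / (n * (n + 2)) for i in range(1, p))
    return mean, sqrt(second - mean ** 2)


mean, se = determinant_moments(p=18, n=122)
print(round(mean, 3), round(se, 4))
```

With these inputs the computed mean and standard deviation agree closely with the reported .267 and .0402, which suggests that 122 df per group were used for the test.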

For each of the three groups, the first six 
of eight centroid factors were significant ac- 
cording to Lawley's (1951) test after iteration 
through two cycles by Bargmann's (1957) 
procedure. With 60 df, the χ² values of 216.70,
168.77, and 140.92 for the P, VE, and C
groups, respectively, exceeded the .001 level.
Since the residuals were less than ±.10, it
appeared that the six factors accounted for 
most of the common variance of the CPI 
scales and that oblique rotations of five or six 
factors were justifiable. The first analytical 
approximation to a simple structure was poor, 
however, probably because many of the scales 
were not experimentally independent and 
hence formed an unusual number of doublets, 
but successive graphical rotations eventually 
yielded solutions which were more adequate 
and which were quite uniform for the three 
groups. The best solution was based upon the 
five factors listed in Table 2, although the 
communalities (h²) for the six centroid factors
indicated that unaccounted for residual
variance remained. Note that there are some 
variations in the factor loadings across groups 
but that these are considerably less marked 
than the consistencies. 
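The 60 df cited for the chi-square tests is what the usual degrees-of-freedom formula for the maximum-likelihood test of the number of factors gives for 18 variables and 6 factors. A minimal sketch follows; the function name `lawley_df` is hypothetical, and the link between the formula and the reported value is an inference from the numbers rather than something stated in the text.

```python
def lawley_df(p, k):
    """Degrees of freedom for the test that k factors account for the
    correlations among p variables: ((p - k)^2 - (p + k)) / 2."""
    return ((p - k) ** 2 - (p + k)) // 2


# 18 CPI scales, testing solutions with five to eight factors.
for k in (5, 6, 7, 8):
    print(k, lawley_df(18, k))
```

Only the six-factor case yields 60 df, so the reported chi-square values appear to refer to the fit of the six-factor solution.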

From the factor matrices five scales, which 
had the highest factor loadings and fewest 
overlapping items, were selected as a reduced
set of CPI variables for a multiple regression
analysis of their relationships to the
remaining scales. The selected set of scales in- 
cluded the following: Dominance (Do), Good 
Impression (Gi), Communality (Cm), Flexi- 
bility (Fx), and Femininity (Fe). With these 
five scales as the predictor variables and the 
other 13 scales as the criterion variables, beta 


TABLE 1 

INTERCORRELATIONS OF THE CALIFORNIA PSYCHOLOGICAL INVENTORY SCALES FOR THE PERSONAL (P),
VOCATIONAL-EDUCATIONAL (VE), AND CONTROL (C) GROUPS

(For each group df = 122)


Scale Do Cs Sy Sp Sa Wb Re So Sc To Gi Cm Ac Ai Ie Py Fx Fe Mean
$ m 47.50 
Do VE 50.16 
` 54.96 
r 55 50.16 
Cs VE 55 50.62 
C 64 53.90 
P "o Ot 44.90 
Sy VE 70 6 50.01 
e 6 63 55.25 
p 60 o — 69 49.32 
Sp VE 56 64 73 51.09 
C 2 GQ & 57.38 
P 741 35 65 67 5440 
Sa VE 68 59 75 67 55.58 
C 66 — 48 33 68 89.53 
P 3 36 36 3 (0 40.24 
Wb VE 30 31 238 16 It 45.94 
G 2 3 41 29 0 49.76 
P 14 21 8 B -2 53 45.96 
Re VE 28 25 09 -04 -08 30 49.66 
c 27 30 3 0 -07 š 49.59 
p 00 10 13 it —14 ó 6l 40.28 
So VE 15 n 03 -02 -0 56 47 48.16 
c "n n 32 0 Q 3» Gt 80.52 
P -0 O0; 03 —1 —-30 6 66 6 40.22 
Sc VE 01 à -17 -i7 —28 68 48 59 44.54 
C  -06 -06 13 -16 -33 62 oO 3 4426 


TABLE 1—Continued

[Rows for To, Gi, Cm, Ac, Ai, Ie, Py, Fx, and Fe, with the Mean and SD columns, are illegible in this scan.]

Note.—Decimal points omitted.




weights and Rs were obtained for the P, VE, 
and C groups combined (df = 366) since the
variance-covariance matrices for the predictors
were homogeneous. The residual intercorrelations
among the criterion variables in this
analysis revealed that Intellectual Efficiency 
(Ie) was highly related to each of them and 
moderately related to the five factor scales. 
Thus, Intellectual Efficiency was included in 
the selected set of CPI scales as a variable 




which may account for part of the obliqueness
among the factors as well as the sixth
factor variance. Table 3 lists the beta weights
and R²s for the prediction of each of the 12
unselected scales from the combined 6 reference
scales. The R²s range from .430 to .734
and average .542. The sum of the specific and
common variance predicted by the selected
set accounts for approximately 66% of the
total variance in the 18 CPI scales.
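The range and average just quoted can be verified directly from the last column of Table 3. The short sketch below works in thousandths to avoid rounding artifacts. The label "Cs" for the first row is an inference (it is the only unselected scale not otherwise listed in the table), and the name `r_squared_thousandths` is hypothetical.

```python
# Squared multiple correlations from Table 3, in thousandths
# (the table omits decimal points).
r_squared_thousandths = {
    "Cs": 533, "Sy": 554, "Sp": 533, "Sa": 542, "Wb": 545, "Re": 463,
    "So": 430, "Sc": 734, "To": 593, "Ac": 582, "Ai": 553, "Py": 436,
}

values = list(r_squared_thousandths.values())
lowest = min(values) / 1000
highest = max(values) / 1000
average = sum(values) / len(values) / 1000

print(len(values), lowest, highest, average)
```

The twelve values average 541.5 thousandths, which rounds to the .542 reported in the text.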


TABLE 2 


FINAL ROTATED OBLIQUE FACTOR LOADINGS OF THE CALIFORNIA PSYCHOLOGICAL INVENTORY (CPI) SCALES
FOR THE PERSONAL (P), VOCATIONAL-EDUCATIONAL (VE), AND CONTROL (C) GROUPS


                      Factors                                   Factors
Scales Groups A B C D E h²* | Scales Groups A B C D E h²*
P 08 74 —09 —09 —02 78 P 07 —06 33 si 22 8) 
Do VE 05 72 19 08 —09 79| To VE 35 27 04 38 06 82 
c 15 76 —10 —08 —04 80 C 11-05 236 48 18 6 
P 04 37 —02 39 13 67 P 61 22 —15 —04 06 7 
C; VE 235 (6 —-06 14 09 71| Gi ve — 49 25 oo ~06 03 7| 
c 06 48 0 27 15 72 c 67 —10 =12 -11 08 8 
P 04 62 01 00 20 80 P =i 0$ 39 —ii W 
Sy VE 15 77 20 1» 22 8| Cm VE  —08 —12 58 —03 08 4 
C 16 50 00 03 35 76 C -0 -11 43 06 06 90. 
R 02 4 —18 26 36 86 P 4s 33 0 o -09 7 
Sp VE 09 66 05 03 36 75| Ac VE 22 10 48 29 0070 
C  —22 29 08 40 51 84 C 38 44 o7 04 —07 
P — —05 (6 —16 08 —08 75 P 19 02 —001 70 -—0! ae 
Sa VE -13 69 24 04 06 76| Ai VE -0 04 03 70 —02 ; 
e -14 57 —05 07 06 64 c 03 07 m 65 19 
P 21 10 22 15 35 76 P o 10 6 si 2 5 
Wb VE 38 OF 30 23 10 71] te VE 20. 33 28 38 ME 
fe 2 —04 21 28 36 72 c 06 17 24 50 16 
a 0 
P i2 05 34 47 =16 74 P 33 19 ia 36 7 
Re VE 22. 11  1& 20 —31 48 | Py VE 14 26 15 38 ^ 51 
Cc 20 04 39 11 —0 66 c 26 24 —08 24 2 
20 5! 
P 17 05 32 —07 02 65 p^ = 07 -irao 0 M MR 
So VE 27 —06 53 01 —10 64 Fx VE  —13 —07 —20 43 7 81 || 
c GIO ii BR =i 3050652 c —21 —08 —12 537 ag ) 
x 35 
B 50 —-04 07 -01 11 91 P o 05 00 02 dm 39 
Sc VE 65 —15 06 03 01 92] Fe VE | —09 —08 —06 —03 D. 30 
[o 58 —10 04 06 —01 90 Go) mdi -o _ 2% (UE 


Note.—Decimal points omitted.
* Communalities are for six centroid factors.


TABLE 3
BETA WEIGHTS AND MULTIPLE Rs FOR THE REFERENCE AND UNSELECTED CALIFORNIA
PSYCHOLOGICAL INVENTORY SCALES BASED UPON THE COMBINED GROUPS
(df = 366)

Scale     Do       Gi       Cm       Fx       Fe       Ie      R²
Cs       404**    068     -050      149**   -060      378**   533
Sy       558**    065      049      016     -149**    217**   554
Sp       452**   -155**   -021      259**   -171**    281**   533
Sa       660**   -279**   -018      057      031      171**   542
Wb       001      [?]      109**   -017     -035      467**   545
Re       011      334**    171**   -152**    206**    370**   463
So      -038      294**    245**   -239**    131**    329**   430
Sc      -272      [?]      093**   -110**    073*     263**   734
To      -026      318**    073*     108**    003      548**   593
Ac       168**    376**    119**   -090*     102**    404**   582
Ai      -008      154**    028      303**    091*     548**   553
Py       213**    154**   -047      096*    -089*     422**   436


Note.—Decimal points omitted.
* .05 level of significance.
** .01 level of significance.


DISCUSSION


The results of the factor analyses both agree
and disagree with Gough’s clusters of scales
for the CPI. With one exception, the Dominance
scale clearly represents the Class I
scales which Gough terms “Measures of Poise,
Ascendancy, and Self-Assurance" and which
include the Capacity for Status, Sociability,
Social Presence, and Self-Acceptance scales.
The exception is the Well-Being scale, which
Gough classified with these variables, but
which has higher empirical relationships with
the Responsibility, Socialization, Self-Control,
and Tolerance scales, a group of measures best
defined by the Good Impression scale. Closely
related to this group, but yet distinct from it,
the Communality scale probably belongs with
these Class II "Measures of Socialization,
Maturity, and Responsibility.” In Class III,
“Measures of Achievement Potential and
Intellectual Efficiency,” the Intellectual Efficiency
scale correlates highly with Achievement
via Conformance and Achievement via
Independence, but it also relates well to most
of the other scales in the CPI. The final group
of scales, the Class IV “Measures of Intellectual
and Interest Modes,” breaks down into
quite specific factors, as assessed by the
Flexibility and Femininity scales, which have only
negligible relationships to each other and to
the other measures. Thus, the findings support
two of the four scale clusters or classes which
Gough proposes but offer no evidence for the
existence of the other two.

The six scales which emerge from the factor
and multiple regression analyses provide an
economical but comprehensive coverage and
evaluation of the major aspects of the normal
personality. Two of the scales, Dominance
and Good Impression, seem to measure markedly
different modes of adjustment or ways of
coping with the environment and others. The
Dominance scale reflects an inclination to
master and control people and things (Horney,
1945) and a tendency to direct and manipulate
affairs: adjustment comes from changes
in external reality rather than in the self. In
contrast, the Good Impression scale 
an attitude of 
conformity (Riesman 
rives from adaptatio 
Though distinctive, 
modes appear equall 
by the positive relati 




ment patterns and status with scores on two 
highly specific attitudinal and trait dimen- 
sions. Finally, the Communality scale, despite 
its relationship to the Good Impression group, 
contributes unique variance to the reference 
scale set and probably measures the tendency 
to make “modal” responses to the CPI, as 
stated in the manual (Gough, 1957). The 
scales yielded by the factor and correlational 
analyses, then, survey most of the variables 
important in the description and definition of 
the normal personality: adjustment modes, 
adjustment level, attitudes and traits, and 
test-taking set. 

That the various statistical analyses indi- 
cate that a reduced set of six scales accounts 
for about two-thirds of the total variance in 
the CPI measures strongly suggests the use 
of the reference scales for more economical 
testing and more meaningful score interpreta- 
tions. Not only are these scales relatively in- 
dependent statistically, but they are almost 
wholly unrelated experimentally. With the ex- 
ception of the Intellectual Efficiency and 
Dominance scales, which have two items in 
common, there is no item overlap among the 
six scales. As a result, the individual scales 
are more sharply defined, and pattern analy- 
ses of scores on combinations of scales are 
more clearly formulated. Since the mean 
values of the three groups tend to differ on 
these selected scales, further discriminant 
function analyses may be of value. The ulti- 
mate meaning and usefulness of the reference 
scales depends, of course, upon their relationships
to other, external variables. Research
currently in progress deals with the problem 
of validating the reference scales and consti- 
tutes the next phase in this series of studies 
on the CPI and the measurement of the nor- 
mal personality. 


SUMMARY 


The purposes of this study were to test with 
factor analytic methods the rational groupings 
of the 18 scales of the California Psychologi- 
cal Inventory (CPI) into four areas or classes 
and to determine the number of independent 
variables measured by the scales. The sample 




consisted of 372 male and female subjects 
classified into three groups: personal adjust- 
ment clients, vocational-educational clients 
from a university counseling center, and non- 
client controls from an undergraduate psy- 
chology course. The factor analyses, which 
were performed on the intercorrelations of the 
CPI scales for each group separately (df 
= 122), yielded a reduced set of six reference 
scales which exhibited homogeneous correla- 
tion matrices for all groups. These scales, 
which included measures of Dominance, Good 
Impression, Intellectual Efficiency, Flexibility,
Femininity, and Communality, represented
group, general, and specific factors
which corresponded to adjustment modes,
adjustment level, personal traits, and test-taking
attitude. The findings supported some of the
original scale clusters but not others and
indicated that the six reference scales predict
most of the reliable variance in the other
measures. The conclusion drawn from the
results was that the reference scales provide
more concise and less ambiguous definitions
of the variables assessed by the CPI and have
considerable promise as measures of the normal
personality.


REFERENCES 


Anderson, T. W. Introduction to multivariate
statistical analysis. New York: Wiley, 1958.

Bargmann, R. A study of independence and dependence
in multivariate normal analysis. University of North
Carolina, Institute of Statistics, 1957, Ser. No.
[illegible]. (Mimeo)

Goodstein, L. D., Crites, J. O., Heilbrun, A. B.,
Jr., & Rempel, P. P. The use of the California
Psychological Inventory in a university counseling
service. J. counsel. Psychol., 1961, 8, [pages illegible].

Gough, H. G. California Psychological Inventory
manual. Palo Alto, Calif.: Consulting Psychologists,
1957.

Horney, Karen. Neurosis and human growth. New
York: Norton, 1950.

Lawley, D. N. The maximum likelihood method of
estimating factor loadings. In G. Thomson,
The factorial analysis of human ability. (4th ed.)
New York: Houghton Mifflin, 1951.

Riesman, D. The lonely crowd. New Haven: Yale
Univer. Press, 1950.

Thurstone, L. L. Multiple-factor analysis. Chicago:
Univer. Chicago Press, 1947.


(Received December 23, 1960) 




Journal of Applied Psychology 


1961, Vol. 45, No. 6, 415-419


CHANCE VERSUS NONCHANCE SCORES ON THE SVIB 1


RICHARD R. STEPHENSON 2


University of Iowa 


Scores on the Strong Vocational Interest 
Blank (SVIB) are usually reported on either 
of two specially devised profile forms, the 
Hankes Report Form or the IBM Report
Form. These special profiles (Strong, 1959)


indicate which scores are significant and which are 
not, for the . . . [profiles are] printed so that all 
nonsignificant [italics added] scores must be plotted 
upon a gray background whereas significant scores 
will fall upon a white background (p. 7). 


The gray background indicates the “chance” 
range. The chance ranges were 


determined by marking 40 blanks on the basis of 
dice throwing and the shaded areas give the middle 
68 per cent of the distribution of such chance scores 
for each scale. Scores which fall in the shaded areas 
may be interpreted as easily obtainable by chance 
and therefore indeterminate for the particular scale 


in question (p. 7). 


While the profile forms may be shaded on 
the basis of the above method, the chance 
means may be more accurately located by 
taking the standard score equivalents of one- 
third of the algebraic sums of the scoring 
weights for each possible response to each 
item for each occupational scale (Strong, 
1959, p. 34). Mathematically exact standard 
deviations for the distribution of chance 
scores have been published by Lyerly (1957).
Thus, for most occupational scales, it is pos- 
sible to precisely delimit the chance range 
(mean, plus and minus one standard devia- 
tion) in standard score units for each occupa- 
tional scale, which makes more relevant the 
question of their significance. 
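
The chance-range computation described above can be sketched as follows. This is a minimal illustration, not the procedure of Strong (1959) or Lyerly (1957): the item weights are hypothetical, the standard deviation assumes each item is answered independently and uniformly at random, and the conversion to standard scores assumes the usual convention of a criterion-group mean of 50 and standard deviation of 10, with hypothetical criterion values.

```python
from math import sqrt

def chance_raw_mean_sd(item_weights):
    """Mean and SD of the raw score when every item is answered at random.

    item_weights: list of (like, indifferent, dislike) scoring weights.
    Assumes the three responses are equally likely and items are independent.
    """
    # Chance mean: one-third of the algebraic sum of the weights, summed over items.
    mean = sum(sum(w) / 3 for w in item_weights)
    # Chance variance: sum of the per-item variances of the weights.
    var = sum(sum(x * x for x in w) / 3 - (sum(w) / 3) ** 2 for w in item_weights)
    return mean, sqrt(var)

def to_standard(raw, criterion_mean, criterion_sd):
    """Convert a raw score to a standard score (assumed mean 50, SD 10)."""
    return 50 + 10 * (raw - criterion_mean) / criterion_sd

# Hypothetical four-item scale and hypothetical criterion-group statistics.
weights = [(2, 0, -1), (-1, 0, 1), (3, 1, -2), (0, 0, 0)]
chance_mean, chance_sd = chance_raw_mean_sd(weights)
chance_range = (
    to_standard(chance_mean - chance_sd, criterion_mean=4.0, criterion_sd=2.0),
    to_standard(chance_mean + chance_sd, criterion_mean=4.0, criterion_sd=2.0),
)
```

The resulting `chance_range` is the mean plus and minus one standard deviation, expressed in standard score units, which is the interval the shaded area is meant to mark.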


1 Based in part upon a paper presented at the
1961 American Personnel and Guidance Association
Convention in Denver, Colorado.

2 The writer wishes to express his gratitude to
Leslie A. King of the General College of the Uni-
versity of Minnesota and Mabel K. Powers of the
Senior Division, College of Science, Literature, and
the Arts, of the University of Minnesota, for making
available their data without which the present re-
search would not have been possible; and to Richard
T. Walker, Research Assistant, for his assistance in
the tedious task of retabulating the above data.


If, as Strong indicates, scores which fall in 
the chance range are nonsignificant, then it be- 
comes important to assess how much of one’s 
data is “nonsignificant” for a given counsel- 
ing client and to compare this result with 
relevant normative data. It becomes even 
more important to have some estimate of the 
amount of change to be anticipated in the 
event that retest is decided upon in the hopes 
of obtaining a “significant” finding. Finally, 
and perhaps most important, if a given SVIB 
profile is composed of two different “kinds” 
of data, chance and nonchance, then it may 
be necessary to re-evaluate all of the research 
findings on SVIB stability. The implication 
in all of these questions is that if a score falls 
in the chance zone because of “chance” fac- 
tors (factors that are unknown or random), 
then, on retest, such scores might shift to any 
other location on the given occupational scale. 
Nonchance scores, on the other hand, ought 
to vary on retest only within the limits of the 
standard error of measurement of the scale in- 
volved. 


METHOD 


While the above considerations form the general 
background for the present study, the specific im- 
petus was found in a study (Stephenson, 1961) done 
on high ability, male, arts college freshmen. To ob- 
tain SVIB letter grade normative data for this group, 
such letter grades were tabulated under the cate-
gories A, B+, B, B− and C+, C, and Chance. The
B− and C+ letter grades were combined into a
single category as a result of Strong’s (1943) ob-
servation that scores falling in these categories are
"neutral and . . . can very largely be ignored" (p.
435). The Chance category was used to denote a
score falling in the chance area. The intent was to
exclude these nonsignificant, chance, scores from the
analysis. However, there were more scores in the
Chance category than in any other single category!
These data provided the nucleus of the normative
data.

Next, SVIB data, made available by Powers
(1956) and by King (1957), were obtained. Powers'
sample consisted of 109 males tested in 1931 and
again in 1941. Most of these men were in nonpro-
fessional noncollege work, and all had a period of
unemployment sometime during this 10-year span.


King's sample consisted of 242, generally low ability 
college freshmen, who were tested in 1954 and again, 
approximately 9 months later, in 1955. Results of 
the first testing of both Powers’ and King’s samples 
were tabulated separately according to the SVIB let- 
ter grade categories described for Stephenson. These 
data provided additional chance score normative data. 

Finally, 100 SVIB answer sheets were scored by 
use of the Rand Corporation's (1955) table of ran- 
dom numbers, a million digit series. This series is 
large enough that no digit series was used more than 
once. The first 9 digits were used to answer the Like, 
Indifferent, Dislike questions, and all 10 digits the 
Most Liked-Least Liked response sections. These 
data complete the normative study. 
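
A minimal sketch of how random digits might be turned into Like, Indifferent, Dislike responses is given below. The text says only that nine of the ten digits were used for these items; which nine, and how they were divided among the three responses, is not stated, so the mapping here (digits 1 to 9 in thirds, 0 skipped) is an assumption, as are the function and variable names.

```python
def lid_responses(digits, n_items):
    """Map a stream of random digits onto Like/Indifferent/Dislike answers.

    Assumed mapping: 1-3 -> "L", 4-6 -> "I", 7-9 -> "D"; a 0 is skipped so
    that the three responses remain equally likely.
    """
    answers = []
    for d in digits:
        if d == 0:
            continue  # unused digit under the assumed mapping
        answers.append("LID"[(d - 1) // 3])
        if len(answers) == n_items:
            break
    return answers

# Hypothetical digit stream standing in for a run from the published table.
sample = lid_responses([0, 1, 5, 9, 0, 3], n_items=4)
```

Skipping one digit keeps each response at probability one-third, which is what the algebraic chance mean presupposes.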

To assess the relative stability of chance versus 
nonchance scores, test-retest data were necessary. As 
indicated, these were secured from Powers and King. 
Each test score (initial testing) for each subject in 
both samples was tabulated on each of 44 occupa- 
tional scales of the SVIB, using the following pro- 
cedure: Each score was first dichotomized as being 
in the chance range, using algebraically derived means, 
plus and minus mathematically exact standard devia- 
tions, or out of the chance range. This standard score 
was tabulated. The retest tabulations, then, were 
simply in terms of SVIB standard scores, and were 
computed for each subject and for each scale. This 
procedure provides an answer to the question: Do 
scores that have fallen in chance areas upon first 
testing differ more at retest than do scores that have 
originally fallen outside the chance area? 
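
The tabulation just described can be sketched for a single occupational scale as follows. The scores and chance parameters are hypothetical, and treating a score exactly on the boundary as in-chance is an assumption the text does not settle.

```python
def average_absolute_deviations(first, retest, chance_mean, chance_sd):
    """Average |retest - first| for in-chance and not-in-chance first scores.

    first, retest: standard scores for the same subjects on one scale.
    The chance range is chance_mean plus and minus chance_sd (inclusive).
    """
    low, high = chance_mean - chance_sd, chance_mean + chance_sd
    in_chance, not_in_chance = [], []
    for x1, x2 in zip(first, retest):
        bucket = in_chance if low <= x1 <= high else not_in_chance
        bucket.append(abs(x2 - x1))

    def average(values):
        return sum(values) / len(values) if values else None

    return average(in_chance), average(not_in_chance)

# Hypothetical scores for four subjects; chance range is 17 to 27.
in_avg, out_avg = average_absolute_deviations(
    first=[20, 25, 40, 12], retest=[26, 22, 45, 10],
    chance_mean=22, chance_sd=5,
)
```

Repeating this over all 44 scales gives one pair of averages per scale, which is the form of data summarized over scales in the findings.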

The rationale for the above question is quite sim- 
ple: If chance is chance, or indifference, or any- 
thing but measurement, then there is no prediction 
as to where such scores will fall upon retest. On the 
other hand, for scores that originally fall outside 
the chance area, retest deviations should reflect only 
predictable errors of measurement. Nonchance scores, 
then, provide the criterion against which to answer 
the above research question. The absolute deviation 
was chosen as the appropriate measure since, under
the assumption, both chance and nonchance scores
will be as likely to vary in one direction as in the 
other, and an algebraic measure would, therefore, 
average to zero. 
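
The point about the algebraic measure can be seen in a small numerical sketch; the retest changes below are hypothetical.

```python
# Hypothetical retest changes (retest minus first score) for four subjects.
changes = [6, -6, 3, -3]

algebraic_mean = sum(changes) / len(changes)                 # signed changes cancel
absolute_mean = sum(abs(c) for c in changes) / len(changes)  # size of change survives
```

Here the algebraic mean is zero although every subject changed, while the absolute mean retains the typical size of the change.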


FINDINGS 


For greater ease in making visual compari- 
son, SVIB letter grade normative data for all 
normative samples have been converted to 
percentages and are presented in Table 1. As 
can be seen in Table 1, the best guess of a 
subject’s score for any scale is chance. Even 
considering that Stephenson’s sample consists 
of high ability college males, that King’s con- 
sists primarily of low ability college males, 
and that Powers’ consists primarily of noncol- 
lege males, it seems that chance as a modal 
score is a common phenomenon. 

A common distributions chi square test of 
the three male samples in Table 1 resulted in 
a chi square of 173.104, which, with 10 df, is
significant far beyond the .001 level. Thus, it
appears that, in groups that differ widely in
scholastic aptitude, significant differences are
to be expected in relative proportions found
within each SVIB letter grade category, in-
cluding the category Chance. However, the
rank-difference correlation (rho) between
King's sample and Powers’ sample is 1.0 and
between either of these and Stephenson's is .94.
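
A rough sketch of the two statistics named above is given below. It rebuilds approximate frequencies from the rounded percentages printed in Table 1 (base: sample size times 44 scales), so the chi square it yields should not be expected to reproduce the published value exactly; the function names are illustrative only.

```python
# Percentages as printed in Table 1; rows are A, B+, B, B- and C+, C, Chance.
PCT = {
    "Powers": [11.5, 8.5, 10.5, 14.0, 26.4, 29.4],
    "King": [11.7, 9.3, 10.9, 13.9, 23.7, 30.5],
    "Stephenson": [10.2, 9.4, 12.5, 18.6, 28.3, 21.0],
}
N_SUBJECTS = {"Powers": 109, "King": 242, "Stephenson": 239}
N_SCALES = 44

def approximate_counts(sample):
    """Frequencies reconstructed from rounded percentages (approximate)."""
    base = N_SUBJECTS[sample] * N_SCALES
    return [p / 100 * base for p in PCT[sample]]

def chi_square(table):
    """Chi square for a samples-by-categories frequency table, with its df."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(col_totals) - 1)

def rank_difference_rho(x, y):
    """Rank-difference correlation; assumes no tied values."""
    def ranks(values):
        ordered = sorted(values, reverse=True)
        return [ordered.index(v) + 1 for v in values]
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    k = len(x)
    return 1 - 6 * d2 / (k * (k * k - 1))

table = [approximate_counts(s) for s in ("Powers", "King", "Stephenson")]
stat, df = chi_square(table)
rho_king_powers = rank_difference_rho(PCT["King"], PCT["Powers"])
```

Three samples by six categories gives the 10 df reported, and the category rank orders for King's and Powers' samples are identical, which is what a rho of 1.0 expresses.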

Table 2 summarizes the findings when the
retest absolute deviations are compared over
SVIB occupational scales, and may be inter-
preted as indicating that, while average scale
deviations are significantly larger after 10

TABLE 1

PERCENTAGES OF SCORES IN THE CHANCE ZONE OR IN ONE OF THE VARIOUS LETTER GRADE
RATINGS ON 44-SCALE SVIB PROFILES FOR THREE DIFFERENT SAMPLES OF MALE PERSONS

SVIB          Noncollege    College low       College high      Blanks scored
letter        males^a       ability males^b   ability males^c   with table of
grade         (N = 109)     (N = 242)         (N = 239)         random numbers
                                                                (N = 100)

A               11.5          11.7              10.2               0.5
B+               8.5           9.3               9.4               3.8
B               10.5          10.9              12.5               7.0
B−, C+          14.0          13.9              18.6              13.6
C               26.4          23.7              28.3              14.3
Chance          29.4          30.5              21.0              60.8

Note. The base for each percentage is the size of the sample times 44.
a From Powers (1956) first testing.
b From King (1957) first testing.
c From Stephenson (1961).


TABLE 2

SUMMARY OF FINDINGS WHEN AVERAGE ABSOLUTE DEVIATIONS ARE
COMPARED OVER SVIB OCCUPATIONAL SCALES

                                 Low ability college       Noncollege
                                 subjects^a (N = 242)      subjects^b (N = 109)
                                 In         Not-in         In         Not-in
Variable                         chance     chance         chance     chance

N of scales                        44         44             44         44
N of differences                 3251       7397           1394       3402
Most variable scale's
  average absolute deviation     8.84       7.79          13.08       9.73
Least variable scale's
  average absolute deviation     4.17       3.95           4.88       4.98
Total scales' average
  absolute deviations          252.46     266.32         333.77     323.02
Average absolute deviation
  per scale                      5.73       6.05           7.58       7.34
Standard deviation of
  scale averages                 1.06        .88           1.86       1.21
t                                     1.518                     .734
p                                    p > .05                   p > .05

Analysis of variance:  SSb = 111.7163   df = 3
                       SSw = 293.0395   df = 172
                       F = 21.8576      p < .01

a Data from King (1957).
b Data from Powers (1956).
c Social science high school teacher.
d Personnel director.
e Industrial arts teacher.
f Banker.
g Real estate salesman.
h Banker.
i Lawyer.


years than after 9 months, for a given period 
of time there is no significant difference be- 
tween scores that fall in chance on a particu- 
lar scale and scores that fall out of chance on 
that scale. 
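
The summary statistics in Table 2 can be checked with a short sketch. The F ratio follows directly from the sums of squares and degrees of freedom in the table. The t formula is an assumption (two independent sets of 44 scale averages, unpooled variances with divisor N), chosen because it comes close to the tabled values; the source does not state which formula was used.

```python
from math import sqrt

def f_ratio(ss_between, df_between, ss_within, df_within):
    """F as the ratio of the between and within mean squares."""
    return (ss_between / df_between) / (ss_within / df_within)

def t_between_means(total_1, sd_1, total_2, sd_2, n):
    """t for the difference between two means of n scale averages each.

    Assumed form: difference of means over sqrt((sd_1^2 + sd_2^2) / n).
    """
    difference = (total_2 - total_1) / n
    return abs(difference) / sqrt((sd_1 ** 2 + sd_2 ** 2) / n)

f_scales = f_ratio(111.7163, 3, 293.0395, 172)
t_king = t_between_means(252.46, 1.06, 266.32, 0.88, n=44)
t_powers = t_between_means(333.77, 1.86, 323.02, 1.21, n=44)
```

Under these assumptions the sketch returns values near the tabled F of 21.8576 and the tabled t values of 1.518 and .734.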

It should be noted that Table 2 summarizes
findings based on four distributions of 44
scales each. Each scale average is based upon
the N that happened to fall in or out of the
chance range on that particular scale. These
Ns vary from 7 (Powers: in-chance, Car-
penter) to 206 (King: not-in-chance, Au-
thor-Journalist, and Forest Service Man).
Further, the occupational scales themselves
have differing reliabilities. Thus, there may
be significant individual differences in chance
and nonchance variability that are masked
by collapsing the data over scales. The ques-
tion that now must be answered is this: Are


there significant differences between the in- 
chance variability and the not-in-chance vari- 
ability of these individuals over these 44 oc- 
cupational scales? The data to answer this 
question are presented in summary fashion 
as Table 3. 

Table 3 may be interpreted as indicating
that individuals are more variable after 10
years than after 9 months, but, after this time,
are no more variable on their originally in-
chance scales than on their originally not-in- 
chance scales, which is the same conclusion 
reached when the data were collapsed over 
scales. 

Note, however, that in Table 3 there is a
significant mean difference between the in-
chance and the not-in-chance variability of
the two groups in King's sample. In explain-
ing this significant difference, at least two


TABLE 3

SUMMARY OF FINDINGS WHEN AVERAGE ABSOLUTE DEVIATIONS ARE COMPARED OVER PERSONS

                                 Low ability college       Noncollege
                                 subjects^a                subjects^b
                                 In         Not-in         In         Not-in
Variable                         chance     chance         chance     chance

N of persons                      242        242            109        109
N of scales                        44         44             44         44
Absolute deviations of
  most variable subject          16.5       20.5           16.5       18.5
Absolute deviations of
  least variable subject          1.5        2.5            2.5        3.5
Total group's absolute
  deviations                   1364.0     1475.0          799.5      810.5
Average deviation per subject    5.64       6.10           7.33       7.44
Standard deviation of
  distribution of subjects       2.02       2.20           2.63       2.84
t                                     2.39                      .30
p                                .02 > p > .01                p > .05

Analysis of variance:  SSb = 373.1    df = 3
                       SSw = 3794.4   df = 698
                       F = 22.806     p < .01

a Data from King (1957).
b Data from Powers (1956).


thoughts come immediately to mind: First is 
that the absolute magnitude of the signifi- 
cant difference is less than one standard score 
point. This small size questions the practical 
significance of the statistical finding. Second, 
the direction of the difference is opposite to 
that predicted by the hypothesis under in- 
vestigation: that in-chance scores are the more 
variable. Thus, though new hypotheses may 
be generated by this finding, the general con- 
clusions of the present investigation, that in- 
chance scores are not more variable than not- 
in-chance scores, cannot be altered. (The 
negligible differences in means between Tables 
2 and 3 are the result of grouping errors in 
Table 3 and rounding errors in both tables.) 


DISCUSSION 


The most obvious fact to emerge from 
analysis of these data is the evidence, per- 
haps not needed by this time, of the high 
test-retest reliability of the SVIB. The largest 
mean difference observed is but seven stand- 
ard score points, and the variabilities of all 
distributions are quite small. Related to this 


relatively inconsequential test-retest variabil- 
ity is the obvious similarity of the in-chance 
and not-in-chance groups in the two samples; 
observed differences in most cases being in 
the range of one or two standard score points.
These data, then, provide no basis upon
which to question the many findings on the
test-retest reliability of the instrument in
question, on the grounds that SVIB profiles
contain two different “kinds” of data.

Thus, while a given range on the raw score
scale of any occupational scale may be inter-
preted as “nonsignificant,” it does not seem
reasonable to make this interpretation on the
basis that the score arrived at that point
through “chance” factors. The finding that
the modal score for these subjects is the
“chance” score is probably an artifact of the
fact that most subjects do not have interests
similar to the interests of a specific occupa-
tional criterion group.

The implications of these findings are at
least threefold. Most important is the prob-
lem of how to interpret "gray areas" to, for,
or about counseling clients. This is, basically,
the question: Why have gray areas on profile
sheets? While there may be some utility in 
calling attention to the zero raw score point 
on each scale (which point approximates the 
chance mean), there would seem to be more 
value in omitting this shading altogether. 
While such an omission ignores the “paradox”
that some men in an occupational criterion
group have negative scores on their own scales
(Strong, 1943, p. 85), the advantage gained
is that all scores would be interpretable in the
same manner, with the interpretation empha-
sizing the empirical derivation of the scale
concerned. Also, there is error at all points
on all scales. Thus, there seems to be no spe-
cial advantage in calling attention to one spe-
cial kind of error.

The second implication concerns the role of
chance in pattern analysis of the SVIB. Pat-
tern analysis by most methods, for example,
Darley and Hagenah (1955) or Stephenson
(1960), takes cognizance of the chance area
as chance. If it develops that chance is not
chance, then pattern analysis thinking must
be revised to either incorporate the zero raw
score concept or to utilize letter grade con-
cepts only.

The final implication is that it is concep-
tually not at all satisfying, in considering
vocational development theory, occupational
choice theory, or vocational counseling tech-
nique, to think in terms of chance, which can-
not be manipulated (by definition) and can-
not be replicated (see the random numbers
column of Table 1). This is especially disturb-
ing in view of the amount of one's interest
test data that must be discarded if one con-
siders gray area chance as being in fact chance
(a step that most investigators apparently do
not take in practice).


SUMMARY 


Strong Vocational Interest Blank (SVIB)
letter grade normative data are presented for
three male groups with scores falling in the
"chance" area considered as a separate letter
grade category. While the proportions of
scores in the chance area differ for the three 
groups, in all groups such proportions are the 
highest of any letter grade category. Using 
the test-retest data of King (1957) and Pow- 
ers (1956), each subject’s score on each of 
44 SVIB occupational scales is tabulated as 
“in-chance” or “not-in-chance” on test. Re- 
test deviations were computed in standard 
score units regardless of direction. These ab- 
solute average deviations were compared be- 
tween King’s two groups and between Pow- 
ers’ two groups. Comparisons were made over 
scales and over persons. All results were in- 
terpreted as implying that in-chance scores, 
being no more variable than not-in-chance 
scores, should probably not be considered as 
resulting from chance, as the term is normally 
used. Other implications of these findings 
were briefly discussed. 


REFERENCES 


Darley, J. G., & Hagenah, Theda. Vocational inter-
est measurement. Minneapolis: Univer. Minnesota
Press, 1955.

King, L. A. Stability measures of Strong Vocational
Interest Blank profiles. J. appl. Psychol., 1957, 41,
143-147.

Lyerly, S. B. “Chance” scores on the Strong Voca-
tional Interest Blank for men. J. appl. Psychol.,
1957, 41, 141-142.

Powers, Mabel K. Permanence of measured voca-
tional interests of adult males. J. appl. Psychol.,
1956, 40, 69-72.

Rand Corporation. A million random digits with
100,000 normal deviates. Glencoe, Ill.: Free Press,
1955.

Stephenson, R. R. Pattern analysis of the Strong
Vocational Interest Blank for men. Amer. Psy-
chologist, 1960, 15, 455. (Abstract)

Stephenson, R. R. Predicting SVIB profiles of high
ability male arts college freshmen. Personnel guid.
J., 1961, 39, 650-653.

Strong, E. K., Jr. Vocational interests of men and
women. Palo Alto: Stanford Univer. Press, 1943.

Strong, E. K., Jr. Manual for Strong Vocational In-
terest Blanks for men and women. Palo Alto:
Consulting Psychologists Press, 1959.


(Received December 23, 1960) 


Journal of Applied Psychology 
1961, Vol. 45, No. 6, 420-427 


VISUAL TARGET LOCATION AS A FUNCTION OF NUMBER 
AND KIND OF COMPETING SIGNALS 1


DAVID M. PROMISEL 2


Johns Hopkins University 


A common problem in the design of man- 
machine systems is that of presenting infor- 
mation to the man quickly and accurately. 
The problem may be to present one informa- 
tional variable, i.e., information that varies 
along some single continuum, or several in- 
formational variables. One method of doing 
this consists of setting up a relationship be- 
tween the values, or continua, of the informa- 
tional variables and the variable aspects of a 
display. Visual displays are most useful for 
this purpose because of the number of ways 
in which a visual signal can vary. For ex- 
ample, the signals on a PPI are recognized as 
small lighted areas on the tube face. Informa- 
tional variables can be presented to a man by 
assigning meaning to, or coding, the variable 
dimensions of these signals to represent the
information.

There are several general methods of ac-
complishing this assignment. One method,
which can be called the nonredundant method,
is to relate an informational variable directly 
to some one variable dimension of the sig- 
nal, e.g., the varying brightness of the above 
"small lighted area on the tube face" might 
be made to signify the size of the object the 
signal represents. Here brightness is the visual 
dimension of the signal, and size of the ob- 
ject is the informational variable. Theoreti- 
cally, as many informational variables can be 
coded as there are variable dimensions of the 
signals. A large red lighted area might repre- 
sent a large object at an altitude of 20,000 
feet, a small blue lighted area might repre- 
sent a small object at 15,000 feet, etc. 


1 This research was done under Contract Nonr-
248(55) between the Office of Naval Research and
Johns Hopkins University. This is Report No. 15
under that contract. Reproduction in whole or in
part is permitted for any purpose of the United 
States Government. The author wishes to express his 
appreciation to A. Chapanis and W. R. Garner for 
their guidance during this investigation. 

2 Now at Bell Telephone Laboratories, Whippany, 
New Jersey. 


Another method is that of redundant cod-
ing, i.e., to signify an informational variable
by a combination of signal dimensions each
of which uniquely, as well as in combination,
represents the informational variable. The
value of the informational variable being pre-
sented can be determined by referring to any
one of the several signal dimensions used as
codes, e.g., a large red lighted area would now
represent only the altitude of the object with
both large and red signifying 20,000 feet.

A third technique is to use a partially re-
dundant coding system. In this case the in-
formational variable is again signified by a
combination of dimensions but the value of
this variable cannot be determined exactly
from any one of the dimensions, as is true in
the completely redundant case. At least one
of the dimensions does uniquely determine
the value but other dimensions merely limit
the possible range of the values. For example,
given a square blue signal representing the
altitude of an airplane, the square could
signify that the altitude was less than [...]
feet while blue could indicate exactly 17,000
feet.

Dimensions that may be used as visual
codes include: hue, geometric shape, area,
number, length, angular orientation, bright-
ness, flash rate, and stereoscopic depth. Each
of these has its limitations in terms of the
number of identifiable steps within the di-
mension, its effect upon operator [...]
distractability, the interference with other di-
mensions, and the speed and accuracy of
identification of the components, or values, of
the dimension. There are also problems
involved in presenting any of the codes.

There are many factors apart from the
codes themselves which affect the ease
of use of a visual coding system. One such
factor may be termed "clutter"; the more
complex the visual display the harder it is
to find any specific signals (Eriksen, 1953,
1954b; French, 1954; Green & Anderson,
1955, 1956; Reese, 1953). “Competition” is 
another factor which occurs when irrelevant 
signals on the display have some, but not all, 
dimensional values identical to the desired 
signal or target. For example, if large red sig- 
nals were being searched for, a small red sig- 
nal would be a “competing” signal. When 
performance in locating targets is deficient 
the use of redundant coding may be of value 
in improving it (Eriksen, 1952, 1953; Green 
& Anderson, 1955, 1956; Reese, 1953). 

Most of the research done in this area has 
been concerned with the coding of one infor- 
mational variable at a time and with the use 
of redundant coding. There has been little 
work dealing with the multiple coding of sig- 
nals to present more than one kind of infor- 
mation simultaneously. This experiment was 
conducted to investigate the effect on opera-
tor performance with multicoded visual sig-
nals of the following variables: the number of 
targets searched for, the value of the targets 
searched for in terms of hue and shape, and 
the quantity and distribution of the compe- 
tition on the display. 


METHOD 
Materials and Apparatus 


The two visual dimensions used as codes in this 
experiment were hue and shape. Individual values of 
these dimensions can be discriminated more quickly 
and accurately than can values of other dimensions 
(e.g., in general, a given hue can be picked out from
other hues more quickly and with fewer mistakes
than a given sized object can be picked out from
objects of other sizes). As components of redundant
codes, hue and shape, again, appear to be the best
dimensions (Eriksen, 1952, 1953, 1954a; Reese, 1953).

The forms used as components of the shape di-
mension were circles, hexagons, diamonds, triangles,
crosses, stars, and squares. These were selected from
a list of shapes recommended for use in coding
(Baker & Grether, 1954). The hues used were (in
Munsell notation) R 5/6, YR 5/6, Y 5/6, GY 5/6,
G 5/6, BG 5/6, and B 5/6. These represent the areas
of most sensitive color vision on the jnd vs. wave-
length plot. The objects, or signals, consisting of all
49 combinations of the two codes, were cut out of
Munsell paper and glued on to white, 3" X 21" cards.
To keep the size of the objects reasonably constant
they were all made just circumscribable by a circle
[...] in diameter.

A schematic of the display equipment is shown in
Figure 1. The display board was 3' square, flat white,
and ruled off by black lines into 81 squares, 4” on a


Fig. 1. Schematic of the display equipment. (The
subject’s task was to search through the 40 signals 
located in the inner—unshaded—49 squares of the 
display board and to touch a stylus to the discs un- 
der all those which matched the sample, hung at the 
left, in both hue and shape.) 


side. The signal cards were hung on hooks located 
near the top of each of the inner 49 squares. Lo- 
cated below the card in each square was a ?"-di- 
ameter circular metal disc. The display board was 
perpendicular to subject’s (S's) line of sight and was 
mounted in a black stand so that the lower edge of 
the board was 42” from the floor. A sample of the
target signal was hung on the stand to the left of 
the display board. The stand also contained a sliding 
panel which, when raised, hid the display from S. 
Illumination was provided by four overhead cool 
white fluorescent bulbs. 

Each of the discs on the display was connected to
a different symbol on an IBM input-output type- 
writer. When a stylus, attached by a long connec- 
tion to the right of the stand, was touched to a disc 
an electrical circuit was closed and the appropriate 
symbol was printed out by the typewriter. When the 
sliding panel dropped, it tripped a microswitch, start- 
ing an electric timer. The timer was stopped by S 
touching the stylus to a disc mounted on the stand 
below the sample target signal. 


Subjects 


Twenty male Ss were used. They were obtained 
from among the graduate and undergraduate stu- 
dents of Johns Hopkins University. All Ss passed a 
shortened version of the Ishihara Color Vision Test
(Chapanis, 1948). The Ss were paid at an hourly
rate and rewards were also given for the best three
performances. 


Procedure 


For each trial 40 signal cards and nine blank white 
cards were placed in random order on the display 
board. The values of these cards were selected from 
the 49 possibilities, subject to the constraints im- 
posed by the experimental design. The sample target 
signal was put in place before the trial and remained 
there throughout the trial. The Ss were handed the 
stylus and were told that, when the sliding panel 
dropped, revealing the display and starting the timer, 
they were to touch the stylus to the discs under all 
the signals that matched the sample in both hue and 
shape. When they thought they had found all such 
signals they were to touch the stylus to the disc un-
der the sample, thereby stopping the timer. Ss were
given two sample trials but [...] sig-
nals. It was explained to the Ss that performance
would be measured by ranking the Ss’ total time


Experimental Variables 


The experiment investigated the effects on perform- 
ance of four independent variables, 

Variable VT was the value of the target signal.
Five of the 49 hue-shape combinations were used as
targets. These were the R circle, the YR hexagon,
the GY triangle, the BG star, and the B square.

Variable NT was the number of target signals to
be located on the display: [...], 2, 4, 6, or 8. Inasmuch
as the total number of signals on the board was held
constant at 40 an increase in the number of target
signals always corresponded to a decrease in the
number of nontarget signals.

Variable NC was the number (0, 8, 16, 24, or 32) 
of competing signals on the display. These were sig- 
nals identical to the target signal in either hue or 
shape but not both. Variable NC was, of course, in- 
dependent of Variable NT but an increase in com- 
peting, nontarget signals was of necessity confounded 
with a decrease in noncompeting, nontarget signals 
(i.e., those signals with neither the same shape nor
hue as the target). 

Variable DC represented the manner of distribu- 
tion of the competition between the hue and shape 
dimensions. Under condition DC1 half the competing
signals competed in terms of hue and half in terms
of shape. Under condition DC2 all the competition
was in the hue dimension.
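
The way the variables NT, NC, and DC fix the make-up of one display can be sketched as follows. The function name and category labels are illustrative, and the actual hue and shape values of the nontarget signals, which the experimental design also constrained, are not modeled.

```python
import random

def compose_display(nt, nc, dc, rng=random):
    """Return the 49 inner squares of one trial in random order.

    nt: number of target signals; nc: number of competing signals;
    dc: 1 (competition split between hue and shape) or 2 (all in hue).
    The 40 signals are filled out with noncompeting nontargets, and the
    remaining nine squares hold blank cards.
    """
    if dc == 1:
        hue_competing = nc // 2
        shape_competing = nc - hue_competing
    else:
        hue_competing, shape_competing = nc, 0
    noncompeting = 40 - nt - nc
    if noncompeting < 0:
        raise ValueError("targets plus competing signals exceed 40")
    cells = (
        ["target"] * nt
        + ["same hue"] * hue_competing
        + ["same shape"] * shape_competing
        + ["noncompeting"] * noncompeting
        + ["blank"] * 9
    )
    rng.shuffle(cells)
    return cells

display = compose_display(nt=4, nc=16, dc=1)
```

Because the total is held at 40 signals, raising either the number of targets or the number of competing signals necessarily lowers the number of noncompeting signals, which is the confounding noted above.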




Experimental Design 


The 20 Ss were randomly assigned to two blocks
of 10 Ss each. These two blocks were treated identi-
cally. Within a block the Ss were again divided ran-
domly into two groups of five each. One group was
given condition DC1, the other condition DC2, but
otherwise they, too, were treated identically.

Within each DC condition in each block the five
Ss received combinations of the remaining three vari-
ables, VT, NT, and NC. Each of these variables had
five levels. Every S was given 25 trials, representing
a different latin square combination of these three
variables. These latin squares were formed in such a
way that a design containing the Ss and any two of
the three variables also formed a latin square for
each level of the third variable. This procedure pro-
duces a complete factorial arrangement within the
DC conditions except that some of the interaction
terms are confounded with subject variances. Com-
parable Ss in the two blocks, although receiving the
same variable combinations, received these in a dif-
ferent random order.
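
One arrangement with the properties described can be sketched with a cyclic construction. This is an assumption for illustration only: the squares actually used are not given, and levels are coded 0 to 4 rather than by their experimental values.

```python
from itertools import product

LEVELS = range(5)

def vt_level(subject, nt, nc):
    """Cyclic rule (assumed): the VT level is fixed by subject, NT, and NC."""
    return (subject + nt + nc) % 5

def subject_trials(subject):
    """The 25 (VT, NT, NC) combinations given to one subject."""
    return [(vt_level(subject, nt, nc), nt, nc) for nt, nc in product(LEVELS, LEVELS)]

# Trials for the five subjects of one DC condition within one block.
all_trials = [trial for s in LEVELS for trial in subject_trials(s)]
```

Each subject's 25 trials form a latin square of VT over NT and NC, and the five subjects together cover all 125 combinations once, that is, a complete factorial arrangement in which some interactions are tied to subject differences.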


RESULTS 


The two dependent variables in this experi-
ment were time scores and error scores. The
time scores were subjected to an analysis of
variance for a modified random block design
with fixed effects. The distribution of the
scores showed some mild skewness and het-
erogeneity but to transform the data seemed
both to be of little value and to complicate
the interpretation of the data. Thus no trans-
formation was made. The results of the analy-
sis showed the VT, NT, NC, and NC X [...]
terms to be significant.

There were not enough errors, 129 in 500
trials, to justify the use of parametric sta-
tistical tests of them. The 25 commission
errors were not subjected to any statistical
analysis at all. The remainder of the errors,
104, were omission errors, i.e., there were
target signals that were not discovered. Using
total omission errors for each S as the
measure, the difference between DC1 and DC2
was not significant according to the Mann-
Whitney U test. Based on two-way tables of
the omission error scores for the variables
NT, NC, and VT, and assuming no interac-
tions, only the NT and VT variables were sig-
nificant using the Friedman two-way analysis
of variance.


Variable VT 


Figures 2 and 3 show the relative effects on
performance of the five different hue-shape
combinations used as targets. It should be
noted that, apart from the reversal between
YR hexagons and B squares and the steeper
slope of the omission error curve, target value
affects performance in a similar manner
whether it is measured in terms of task time
or in terms of the number of omission errors.
Figure 2 shows that, when all other conditions
are averaged out, the R circle results in the
quickest task time and the GY triangle the
slowest.

These results with regard to target value
are consistent, but in themselves they give no
indication of the critical factor which made
one target value better than another. Since
with the targets actually used hue and shape
were completely confounded, it is impossible
to know whether the R circle was best because
it was a circle or because it was red. It
is possible, however, to derive some reasonable
inferences about which of these two factors is
critical by examining some of the subconditions,
which vary in the extent
to which each of these two codes was required
to be used by S.

Table 1 gives the rank-orders of the time
scores for the different target values for four
subconditions. Under the NC0 condition there
are no competing signals on the display. S
may identify the targets in terms of their hue
or shape or both. The rank-order of the times
under this subcondition is identical to the
overall rank order as shown in Figure 2. Under
Subcondition DC1, target location is dependent
on both hue and shape, and in this
case, the rank-order of the times is still the
same. With the DC2 condition, identification
must be made in terms of shape, but hue is
still helpful to some extent, and with this
condition the rank-order of times is still
correlated with those of Figure 2, but not perfectly.
Finally, under one condition (NT8 NC32 DC2)
there are eight targets and the remaining 32
signals all compete in the hue dimension.
Thus with this subcondition, the identification
must be made on the basis of shape
alone, and in this case the rank-order of times
is completely uncorrelated with those in
Figure 2. It seems reasonable, therefore, to
suppose that the rank-order of the times shown
in Figure 2 is caused primarily by differences
in hue and not by differences in shape.


Fig. 2. Time to locate different hue-shape combinations.
(Averaged over all experimental conditions.)
Axes: task time (sec.) by target value (R Ci, YR He, B Sq, BG St, GY Tr).


Fig. 3. Total omission errors made while searching
for different hue-shape combinations.
Axes: omission errors by target value (R Ci, YR He, B Sq, BG St, GY Tr).


TABLE 1

Target values                      R Ci   YR He   B Sq   BG St   GY Tr

Shape and/or hue dependent          1      ...     ...    ...     5
Shape and hue dependent             1      ...     ...    ...     5
Shape dependent, hue helpful        1      4       3      2       5
Shape dependent, hue not helpful    ...    ...     ...    ...     ...


Fig. 4. Average time to locate all the target signals
as a function of the number of target signals. (Each
point is the average for all other experimental
conditions.)

The only difference between the subcondi- 
tion in which target identification is depend- 
ent completely on shape and the other three 
subconditions is the increasing dependency of 
the identification on the hue dimension. There 
is a shift in the rank-order away from that of 
the purely shape dependent subcondition as 
the dependency on hue increases. It is possible 
to verify further this description of the shift. 
Of the 25 commission errors, 23 occurred
under the DC1 condition. Of these, 15 occurred
when G stars were mistaken for BG stars and
six occurred when G triangles were mistaken
for YG triangles. In other words, the
confusion of both the BG and YG hues with the
G hue caused almost all of the commission 
errors. Also, Ss commented to the effect that 
hexagons and circles were relatively difficult 
to pick out, as were YG and BG. These facts 
concerning relative performance with the dif- 
ferent code values are mirrored in the rank- 
orders mentioned above. Thus, the charac- 
teristics of the hue code used in this experi- 
ment override those of the shape code in de- 
termining task time. 


David M. Promisel 


Whether conclusions regarding the codes 
used here can be extended to hue-shape codes 
in general is an open question. The lack of 
significance of any of the interactions involv- 
ing the VT variable indicates, however, that 
the results pertaining to the other independent 
variables that were obtained with these spe- 
cific codes can be stated with some generality. 


Variable NT 


Figures 4 and 5 show task time and omis- 
sion errors, respectively, as a function of the 
number of target signals on the display. The 
task time function is described as a straight 
line, justified by an analysis of the data into 
orthogonal polynomial terms that shows only 
the linear component to be significant. Al- 
though no such analysis was possible for the 
omission error function, it too appears to be 
linearly increasing over at least the first four
experimental levels. That the drop in the
function when there are eight target signals
on the display is meaningful is suggested by
the notion that if the value of NT were
carried to the extreme where all of the
signals were targets, then there would
probably be no omission errors. This drop may
indicate the beginning of a decreasing
function.
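
The decomposition into orthogonal polynomial terms mentioned above can be sketched for five equally spaced levels, such as the NT levels 0, 2, 4, 6, and 8. The coefficients are the standard tabled values for five levels; the condition means in the example are hypothetical and are not data from this experiment, and the function name is an assumption.

```python
# Standard orthogonal polynomial coefficients for five equally spaced levels.
COEFFICIENTS = {
    "linear": [-2, -1, 0, 1, 2],
    "quadratic": [2, -1, -2, -1, 2],
    "cubic": [-1, 2, 0, -2, 1],
    "quartic": [1, -4, 6, -4, 1],
}


def trend_contrasts(means):
    """Return the contrast value of each polynomial trend for five means."""
    if len(means) != 5:
        raise ValueError("exactly five level means are required")
    return {
        name: sum(c * m for c, m in zip(coeffs, means))
        for name, coeffs in COEFFICIENTS.items()
    }


# Hypothetical task-time means that rise by a constant step per level.
hypothetical_means = [2.0, 4.0, 6.0, 8.0, 10.0]
print(trend_contrasts(hypothetical_means))
```

For means that fall exactly on a straight line, only the linear contrast is nonzero, which is the pattern described for the task time function. In an actual analysis each contrast would be tested against the appropriate error term.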


Fig. 5. Total omission errors made as a function of
the number of targets to be found.
Axes: omission errors by number of targets (0, 2, 4, 6, 8).




Fig. 6. Average time to locate targets as a function
of the number of competing signals. (Each point is
the average for all other experimental conditions. The
function for the DC1 condition has been drawn
connecting the points because it contains a quartic
component as well as a linear component.)
Axes: task time (sec.) by number of competing signals (0, 8, 16, 24, 32).


Variables NC and DC 


The effect of the number of competing signals
on task time is shown in Figure 6 for the
overall situation and that occurring when DC1
and DC2 are analyzed separately. The linear
functions for the overall condition and the
DC2 condition are again justified by the
significance of the linear terms, whereas the
analysis shows a significant quartic as well as
linear component for the DC1 condition. This
quartic component indicates that there is some
essential difference in the task required by the
latter condition, a multicode situation, as
compared with the former condition, a partially
redundant code situation. This task
difference does not result in a significant
difference in task time between the two conditions,
however. This lack of significance is caused
by the fact that the proper error term for
testing the significance of this difference is a
between-subjects term, which was quite large
in this experiment. It seems likely that with a
larger number of Ss a significant difference
would be found.




Overall performance 


Ss’ performance was measured in terms of 
both task time and errors. The data show 
that where the independent variables affected 
the two measures, they did so in a similar 
manner for both of them. This was despite a 
tendency for the Ss to trade off speed for ac- 
curacy and vice versa. The Spearman rank 
correlation coefficient for the relationship be- 
tween the total time and error scores for each 
of the 20 Ss is —0.626. The similarity be- 
tween the two measures is apparent even 
though the Ss' trade-off tendencies serve to 
mask it out. 


DISCUSSION 


Ss’ task in this experiment required them 
to make two kinds of responses, overt and 
covert. The overt response is that actu- 
ally required in tapping the metal discs, and 
the covert response is the discrimination re- 
quired to decide which discs to tap. The NT 
variable was the only one affecting the overt 
responses, and, in terms of effect on perform- 
ance time, it was by far the most important. 
The sum of squares in the analysis of vari- 
ance for this variable was far greater, by a 
factor of 10, than those for the main effects 
of the other variables and amounted to almost 
a third of the total sums of squares. It seems 
likely that the linear effect of NT on task 
time is due primarily to the linear increase in 
the number of overt responses required as the 
number of target signals increases. The non- 
zero intercept of the function is created by 
the necessity for S to scan the display even 
when there are no targets on it. It follows 
that the most substantial savings in task time 
in this experiment could have been effected 
by designing S’s task of designating targets to 
require fewer and/or quicker overt responses.

Variables VT, NC, and DC involved only
covert, or discriminative, responses from S
and had relatively small but nevertheless
significant effects on task time. Thus, the NT
effect is probably a composite of two functions.
One is the effect produced on task time
by NT as a result of the overt responses
required by it; the other is the result of the
covert responses required. Since the overt
responses would be far more time consuming
than the covert responses, the former would
mask out the latter. Perhaps the covert
response function for NT is similar in shape to
the omission error vs. NT curve, since the
number of omission errors is affected only by
covert responses and not at all by overt
responses.

The NC variable affected only the difficulty 
of the perceptual discriminations, not the 
number of them, i.e., the amount of clutter 
on the display remained constant. For any 
value of NT S had to examine a constant 
number of nontarget signals. But as the value 
of NC increased, the difficulty of discrimi- 
nating the target signals from the nontarget 
signals increased. If the significance of the 
NC X DC term is to be given meaning it 
must be that the difficulty of discrimination 
is a function not only of the quantity of com- 
peting targets, but also of their quality. 

A further explanation of this function can
be given in terms of the "effective" number
of signals to be examined as determined by
the manner of distribution of the competition.
In the partially redundant case, DC2, it has
been shown (Green & Anderson, 1955, 1956)
that the competing value effectively
determines the area of search. S, in some sense,
only examines those signals having the
competing, i.e., partially redundant, value. This
fact would explain the linear effect obtained
in DC2: the larger this effective area, as NC
increases, the longer the search time. Now, if
one assumes that for sufficiently large numbers
of competing signals one of the two codes
in DC1 approximates a partially redundant
code and thus determines the effective search
area, whereas for smaller numbers the number
of competing hue and shape signals combined
determines that area, it is possible to
generate a predicted curve with a shape similar
to the actual curve shown in Figure 6 for
DC1. The predicted curve is based on the Ss'
having to search out effective areas of the
following numbers of signals (averaged over
NT) as NC increases from 0 to 32: 4, 12, 20,
16, 20. Thus, at the lower three levels of NC
the effective area to be searched by the S
includes the targets (an average number of
four) and all the competing signals, in both
hue and shape. For example, at the second
level of NC there is an average of four targets
on the display, four signals competing in
hue and four competing in shape, producing
an effective search area of 12 signals. At the
upper two levels, the effective area of search
would include, again, the targets plus only
those signals which compete in one dimension
(perhaps the signals that compete in hue).
At the highest level, for instance, there are
the four targets, 16 competing hue signals
and 16 competing shape signals. It is
hypothesized, though, that S, in effect, picks
one of the target values (e.g., target hue) and
then limits his area of search to only those
signals having that value.
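
The predicted effective search areas quoted above (4, 12, 20, 16, 20) can be reproduced with a few lines of arithmetic. This is only a sketch of the stated assumptions: the function name is hypothetical, and the switch point is placed between the third and fourth NC levels because the text assigns the "lower three" and "upper two" levels to the two rules. Competition is taken to be split evenly between hue and shape, as in the worked examples in the text.

```python
NC_LEVELS = [0, 8, 16, 24, 32]  # numbers of competing signals
MEAN_TARGETS = 4                # average number of targets over the NT levels
SWITCH_ABOVE = 16               # assumed: the one-dimension rule applies above this NC


def effective_search_area(nc):
    """Effective number of signals S is assumed to examine at a given NC."""
    if nc <= SWITCH_ABOVE:
        # Lower levels: targets plus all competing signals (hue and shape).
        return MEAN_TARGETS + nc
    # Upper levels: targets plus only the signals competing in one dimension,
    # i.e., half of the competing signals.
    return MEAN_TARGETS + nc // 2


predicted = [effective_search_area(nc) for nc in NC_LEVELS]
print(predicted)
```

The dip from 20 to 16 between the third and fourth levels is what gives the predicted curve its departure from a straight line.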


SUMMARY 


Performance in obtaining information from
a double coded visual display was investigated
as a function of these variables: the
number of target signals on the display, the
values of the hue-shape combinations used
as target signals, the number of competing
signals on the display, and the distribution of
this competition between the hue and shape
dimensions.

The number of targets, the value of the
targets, and the number of competing signals
all significantly affect task time, the
number of targets having by far the greatest
effect. Both task time and errors are greater
for some targets than for others, and the evidence
suggests that hue rather than shape is the
critical factor. Both task time and errors are
again larger for larger numbers of targets,
and task time alone increases with the larger
number of competing signals. While there is
no difference in task time between
double coded targets and partially redundant
targets, the tasks themselves appear to be
different.


REFERENCES 


Baker, C., & Grether, W. Visual presentation of
information. USAF WADC tech. Rep., 1954, No.
54-160.

Chapanis, A. A comparative study of five tests of
color vision. J. Opt. Soc. Amer., 1948, 38, 636-649.

Conover, D., & Kraft, C. The use of color in coding
displays. USAF WADC tech. Rep., 1958, No.
55-471.

Duncan, A. Quality control and industrial statistics.
Chicago: Richard D. Irwin, 1959.

Eriksen, C. W. Location of objects in a visual display
as a function of the number of dimensions on
which the objects differ. J. exp. Psychol., 1952,
44, 56-60.

Eriksen, C. W. Object location in a complex perceptual
field. J. exp. Psychol., 1953, 45, 126-132.

Eriksen, C. W. Multidimensional stimulus differences
and accuracy of discrimination. USAF WADC
tech. Rep., 1954, No. 54-165. (a)

Eriksen, C. W. Partitioning and saturation of the
perceptual field and efficiency of visual search.
USAF WADC tech. Rep., 1954, No. 54-161. (b)

Fisher, R. A., & Yates, F. Statistical tables for biological,
agricultural and medical research. London:
Oliver & Boyd, 1953.

French, R. S. Pattern recognition in the presence of
visual noise. J. exp. Psychol., 1954, 47, 27-36.

Green, B., & Anderson, L. Size coding in a visual
search task. Mass. Inst. Technol. group Rep., 1955,
No. 38-17.

Green, B., & Anderson, L. Color coding in a visual
search task. J. exp. Psychol., 1956, 51, 19-24.

Humphrey, G. Thinking: An introduction to its experimental
psychology. New York: Wiley, 1951.

Reese, E. P., Reese, T. W., Volkmann, J., & Corbin,
H. H. Psychophysical research summary report:
1946-1952. USN Spec. Dev. Cent. tech. Rep.,
1953, No. 131-1-5.

Scheffé, H. The analysis of variance. New York:
Wiley, 1959.


(Received January 12, 1961) 


Journal of Applied Psychology
1961, Vol. 45, No. 6, 428-430


EFFECT OF PERSONAL CHARACTERISTICS ON 
RELATIONSHIPS BETWEEN ATTITUDES 
AND JOB PERFORMANCE¹


FRANCIS D. HARDING AND ROBERT A. BOTTENBERG


Personnel Laboratory, Lackland Air Force Base 


Because the relationship between measure- 
ments of attitudes and job performance has 
generally been found to be low, the nature of 
research in this area has been undergoing 
change. One conclusion has been to no longer 
consider satisfaction and morale indexes as 
variables intervening between supervisory and 
organizational characteristics on the one hand 
and productivity on the other (Kahn, 1960). 
Another suggestion has been that morale be 
considered as an output variable, and at- 
tempts be made to delineate the conditions 
under which morale is related to other group 
outcomes (Whitlock & Cureton, 1960). 

This study investigates the possibility that 
by giving more consideration to character- 
istics of workers and jobs, greater relation- 
ships between attitudes and productivity will 
be found. In this regard, it has been shown
that workers performing different kinds of
work tend to react differently toward
management policies (Sayles, 1958). It has also
been shown that the amount of work experience
of an individual and the organization to which
he belongs affect his ratings of the work
activities making up his job (Harding &
Naurath, 1960).

This report deals with a reanalysis of the
data collected by Whitlock and Cureton in
their investigation aimed at developing effective
means for measuring morale among Air
Force personnel. In that study no attempt was
made to control for the influence upon an
airman's attitude of his rank, age, length of
service, career intention, or type of work
performed. The purpose of the present research
is to determine the extent to which such
biographical data affect the relationships between
attitude and job performance. It was
hypothesized that personal characteristics such
as rank, length of service, career intention,
type of work, and months within the organization
when combined with airmen's scores on
attitude scales would contribute to the
prediction of criteria such as job performance
and absences from work.


¹ The research reported in this paper was sponsored
by Personnel Laboratory, Aeronautical Systems
Division, under AFSC Project 7719, Task 17130.


PROCEDURE 


The study involved the application of multiple
regression techniques to the data collected by
Whitlock and Cureton. Attitude scores and biographical
data variables were used to predict criteria of job
performance for 376 airmen. First, all predictors
were allowed to enter the regression equation. Then,
by removing certain groups of predictor variables it
was possible to determine their independent
contribution over and above that of the variables
remaining in the equation.

The variables used in the regression analysis may
be classified as follows:


Attitude variables:

1. Satisfaction with supervisor: supervisor-subordinate
relations

2. Satisfaction with Air Force as an organization:
how well the Air Force is felt to be
accomplishing its mission

3. Job satisfaction: interest and pride in the job

4. Satisfaction with civilian ...: ... relations
between local community and ...

5. Satisfaction with Air Force life: overall ...
of being in the Air Force

6. Satisfaction with management and communication:
how well informed the airman feels

7. Satisfaction with unit: pride in work group
and opinion of the group's leaders

8. General morale: covering everything that is
measured by the other scales


Biographical variables:

1. Rank of the airmen: this characteristic was
represented in the regression analysis by five
variables, one each to indicate membership in a
particular rank category

2. Career intention: whether or not an airman
intended to make a career of the Air Force

3. Length of service: the sample of airmen was
grouped into those who had less than four years of
service and those who had more service

4. Kind of work performed: the airmen were
divided into two groups: those whose work was closely
related to flying and maintaining aircraft and those
whose work was less involved in the actual
operational mission

5. Number of months in squadron: the number of
months the airman had been assigned to his present
unit

Interaction variables:

1. Interaction between rank and attitude: variables
which show the scores for a particular attitude
of airmen who hold the same rank. For each of the
five airmen ranks there are eight variables, one for
each of the attitude scales. The elements of such
interaction variables are the attitude scores of airmen
holding the same rank and zeros for all airmen
holding other ranks

Criterion variables:

1. Job performance rating: each airman was rated
on a six-point scale by his supervisor

2. Job performance ranking: each airman was
ranked in respect to overall job performance by his
supervisor

3. Estimated absences: the number of times each
airman had asked to be excused from duty during
previous 3-month period for reasons other than his
own sickness


TABLE 1
MULTIPLE R²s OBTAINED IN PREDICTING THE JOB PERFORMANCE AND ABSENCES CRITERIA

                                               Criterion
                                    Job            Job
                                    performance    performance    Estimated
Predictors                          rating         ranking        absences

All variables                       .18            .16            .14
Biographical variables only         .08            .09            .04
Attitude-rank interaction
  variables removed                 ...            ...            .07
Linear attitude variables only      .07            .07            .04
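
The construction of the rank-by-attitude interaction variables described in the procedure can be sketched as follows. The rank labels, the function name, and the example scores are hypothetical; only the layout (eight attitude scores placed in the block for the airman's own rank, zeros in the blocks for the other four ranks) is taken from the description.

```python
RANKS = ["rank_1", "rank_2", "rank_3", "rank_4", "rank_5"]  # hypothetical labels
N_SCALES = 8  # number of attitude scales


def interaction_variables(rank, attitude_scores):
    """Return the 5 x 8 = 40 interaction values for one airman.

    The block belonging to the airman's own rank holds his eight attitude
    scores; the blocks for all other ranks hold zeros.
    """
    if rank not in RANKS:
        raise ValueError("unknown rank: %r" % (rank,))
    if len(attitude_scores) != N_SCALES:
        raise ValueError("expected %d attitude scores" % N_SCALES)
    row = []
    for candidate in RANKS:
        if candidate == rank:
            row.extend(attitude_scores)
        else:
            row.extend([0] * N_SCALES)
    return row


# Hypothetical airman in the third rank category with made-up scale scores.
example = interaction_variables("rank_3", [12, 9, 15, 7, 11, 10, 14, 13])
print(len(example), example[16:24])
```

Coding the interaction this way lets the regression fit a separate attitude weight within each rank category.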


RESULTS AND DISCUSSION 


How much of the criterion variance is accounted
for by various combinations of predictors
is shown in Table 1 by use of multiple
R²s. The first row shows the proportion of
variance accounted for when all the predictors
are allowed to enter the equation. The
values represent correlations of .43, .41, and
.38, respectively, for the three criteria. The
size of the correlations is encouraging and
suggests that with additional refinements the
variables would provide usable research tools.

An analysis was performed to test whether
biographical variables would contribute a
statistically significant amount of unique variance
when used with a linear combination of
the attitude variables. This was shown to be
the case for the two job performance criteria.
These findings are in keeping with the
hypotheses stated above and mean that if
you are only using attitude variables to predict
job performance, you can improve your
prediction by adding biographical variables
such as those used in this study.
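
The report does not name the significance test applied to the differences between multiple R²s. A common choice for nested regression equations is the F ratio for an increment in R², sketched below under that assumption. The function name is hypothetical. In the example, the R² values and N = 376 come from the text, while the predictor counts (57 in the full equation, 48 added) are assumptions inferred from the variable list and are not figures stated by the authors.

```python
def f_for_r2_increment(r2_full, r2_reduced, n, k_full, q):
    """F ratio for the gain in R² when q predictors are added.

    r2_full    -- R² of the equation containing all k_full predictors
    r2_reduced -- R² of the equation with the q predictors removed
    n          -- number of cases
    The ratio has q and (n - k_full - 1) degrees of freedom.
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Job performance rating: all variables (.18) versus biographical only (.08).
# k_full=57 and q=48 are assumed counts, used for illustration only.
f_ratio = f_for_r2_increment(0.18, 0.08, n=376, k_full=57, q=48)
print(round(f_ratio, 2))
```

With many added predictors the numerator is divided by a large q, so even a sizable rise in R² can yield a small F ratio.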

An additional analysis was made to determine
if the attitude variables added to the
prediction of job performance beyond that
which was predicted by only the biographical
variables. For example, in the present study,
it is possible that an airman's rank, length of
service, career intention, or the type of work
performed might account for the variance in
job performance to the extent that attitude
variables added nothing further. As shown in
Table 1, the multiple R² obtained when only
biographical data variables are used as predictors
is .08 for job performance ratings and
.09 for the job performance rankings, and for
the criterion of estimated absences it is .04.
When variables which reflect attitude measures
are added to the group of predictors the
squared multiples become .18, .16, and .14 for
the three criteria. While appreciable, these
increases in the multiple R²s are not
statistically significant at the .10 level. One
possible interpretation is that the increase in
predictive efficiency is due to the use of a
very sizeable number (40) of additional predictor
variables rather than the addition of
independent variance. The failure of the attitude
variables to make a significant independent
contribution beyond that which is contributed
by the biographical variables is an
important finding. It suggests that much of
the variance which in the past has been
attributed to attitudes is held in common by
the more easily obtained biographical variables.

The extent to which the attitude by rank 
interaction variables contribute to the predic- 
tion of job performance was also tested by 
a comparison of the appropriate multiple R²s.
For each of the three criteria there was an
increase in the proportion of predictable variance
when the interaction variables were
added to biographical and linear attitude
predictors; however, none was large enough to
be statistically significant.


CONCLUSIONS 


The suggestion was made that one reason
for the lack of relationship usually found
between attitudes and job performance is the
failure to take into account important
biographical and situational variables. It was
hypothesized that the inclusion of such variables
in multiple regression problems would
improve the prediction of job performance.

Comparisons of the multiple R²s obtained
when several combinations of variables were
used to predict criteria of job performance
and absences from job permit the following
statements to be made:




1. The contribution of attitude variables to 
the prediction made by biographical variables 
is not significant. 

2. The contribution of attitude-rank interaction
variables to the combination of linear
attitude variables plus biographical variables
is not significant.

3. The contribution of biographical vari- 
ables to a linear combination of attitude vari- 
ables is significant for job performance cri- 
teria but not for the estimated absences cri- 
terion. 

These findings suggest that much of the 
criterion variance associated with attitude 
measures is also related to more easily ob- 
tained biographical characteristics of workers 
and that biographical variables should be con- 
sidered when relating attitudes to produc- 
tivity. 


REFERENCES 


Harding, F. D., & Naurath, D. A. Effects of job
experience and organization on the rating of tasks.
Engng. industr. Psychol., 1960, 2, 63-68.

Kahn, R. L. Productivity and job satisfaction.
Personnel Psychol., 1960, 13, 275-287.

Sayles, L. R. Behavior of industrial work groups:
Prediction and control. New York: Wiley, 1958.

Whitlock, G. H., & Cureton, E. E. Validation of
morale and attitude scales. USAF WADD tech.
Rep., 1960, No. TR-60-76.


(Received January 13, 1961) 


Journal of Applied Psychology
1961, Vol. 45, No. 6, 431-434 


THE STABILITY OF GUILFORD-ZIMMERMAN 
PERSONALITY MEASURES 


JAY M. JACKSON


University of Kansas


In using test scores in the selection of employees,
it is desirable that so-called measures
of personality remain relatively stable over
time. This is the only justification for assuming
that they describe attributes of the person,
rather than the person's reaction to his
immediate situation. Many persons, for example,
will feel somewhat insecure in a new environment
with strangers. But some persons will
have a greater tendency than others, and will
require more supportive and less threatening
situations. One would expect such persons to
be less secure than the average regardless of
situational changes, and their relative insecurity
to persist over time.

As part of a larger study² of factors related
to turnover in two commercial offices of
a telephone company, the Guilford-Zimmerman
Temperament Survey, a standardized
personality inventory originating in factor
analysis, was administered to 96 female employees.
A number of findings pointed to the
relevance for performance of several of the
traits measured by the GZTS. It was found,
for example, that girls who were relatively
high on Sociability and Ascendance had a
greater aptitude for the job, and those low
on Security, an index constructed by the
investigator from items of the inventory (Jackson,
1956), had considerable difficulty making
an adjustment to the work situation. In the
light of these and other results it was concluded
that the GZTS showed promise as a
potential management aid in selection of personnel
for the job being studied, if it could
be concluded that the test was measuring
relatively persistent characteristics of persons,
rather than reflecting reactions to the test
situation, the current life situation, or temporary
feelings.


² A number of papers reporting results of this project
are in the process of publication. A report was
made to the company in 1955 (see Jackson, 1955).


An examination of the quite extensive literature
yields no evidence regarding the question
raised: the GZTS has been correlated
with intelligence scores (Bass, Wurster, Doll,
& Clair, 1953; Psychometric Notes, 1954),
college grades (Psychometric Notes, 1954),
academic achievement (Bendig & Sprague,
1954; Goedinghaus, 1954), leaderless group
discussion scores (Bass, McGehee, Hawkins,
Young, & Gebel, 1953; Bass et al., 1953),
success in training occupational therapy students
(Booth, 1957), personnel assessments
(Hilton, Bolin, Parker, Taylor, & Walker,
1954), selection and placement of engineers
(Kirkpatrick, 1956), perseveration of set
(Pitcher & Stacey, 1954), and other personality
tests (Gilbert, 1950; Jones, 1954; Kaess
& Witryol, 1957). Norms have been reported
for males vs. females (Eisele & Cottle, 1951;
Isaacson & Cottle, 1952), urban vs. rural
(Isaacson & Cottle, 1952; Johnson & Cottle,
1952), Negroes vs. whites (Johnson & Cottle,
1952), and normals vs. neurotics (Franks,
1956). But there appears to be no published
evidence regarding the relative stability of
the trait scores over time.


METHOD 


The Guilford-Zimmerman Temperament ... subjects
in the original sample ... By the time of the ...
subjects were considerably ... job training ...
some subjects had been married, some promoted, and
a few demoted. For many of the subjects, friends in
the office had left and new friendships had been
formed; their supervisors had changed at least once;
and factors in their personal, emotional, and social
lives could be assumed to have undergone change.
Thus it was assumed that the psychological situations
for most subjects were different for the two
administrations of the test.

The rationale provided the subjects for administering
the personality inventory was that the research
was attempting to determine whether a particular
type of girl was attracted to their job, or
whether they were simply a "cross section of American
girls." It was implied that they were successful
employees, and that management wanted to be able 
to select others like them. It appeared that these in- 
structions were successful in reducing the threat for 
subjects. They were cooperative, relaxed, and the 
test situation seemed close to ideal. 




RESULTS 


Scores for the two administrations of the 
GZTS are reported in Table 1. Since it is 
possible that length of employment at the 
time of initial testing might be a factor in 
whether or not changes occurred, results are 
presented separately for less experienced and 
more experienced employees. Means and stand- 
ard deviations are also presented for the total 
sample, including supervisors. There is a strik- 
ing similarity between the 1953 and 1955 
mean scores for all scales and for each cate- 
gory of the sample. The difference between 


TABLE 1
STABILITY OF GUILFORD-ZIMMERMAN SCORES OVER AN 18-MONTH
PERIOD FOR 49 FEMALE OFFICE WORKERS

                                  Less            More            Total
                                  experiencedᵃ    experiencedᵇ    sampleᶜ

Scale                     Year    M      SD       M      SD       M      SD

General Activity          1953    16.9   4.6      19.1   6.9      18.2   6.2
                          1955    15.4   5.5      20.0   6.0      17.9   6.2

Restraint                 1953    14.6   5.1      16.8   4.1      16.9   4.8
                          1955    14.3   5.3      17.0   4.4      17.0   ...

Ascendance                1953    13.0   ...      13.9   4.9      14.2   5.0
                          1955    14.2   5.2      14.5   ...      14.7   5.5

Sociability               1953    22.5   4.6      21.0   5.4      22.0   5.5
                          1955    22.0   4.3      21.0   6.4      21.6   5.8

Emotional Stability       1953    18.0   4.4      20.2   4.7      20.0   4.7
                          1955    18.6   4.2      20.9   5.4      20.4   5.4

Objectivity               1953    19.4   2.6      19.7   3.6      19.6   4.5
                          1955    19.3   4.1      19.5   4.5      19.6   4.7

Friendliness              1953    16.4   3.6      17.0   6.2      17.7   5.4
                          1955    15.6   3.8      15.8   5.0      16.7   4.9

Thoughtfulness            1953    16.9   6.0      17.4   4.1      18.7   6.6
                          1955    16.5   7.2      16.4   4.5      15.3   ...

Personal Relations        1953    20.3   4.4      21.0   5.5      21.6   5.4
                          1955    20.8   4.8      20.7   5.6      21.6   5.8

Masculinity-Femininity    1953    10.9   3.3      11.6   4.4      11.4   3.8
                          1955    12.1   3.6      10.4   4.9      11.7   4.3

Securityᵈ                 1953    ...    ...      7.4    2.0      7.3    1.6
                          1955    7.4    1.4      7.4    1.6      ...    ...


ᵃ Subjects with less than one year's service at initial testing, N = 11.
ᵇ Subjects with one year or more service at initial testing, N = 23.
ᶜ Includes supervisors, N = 49.
ᵈ This is a 10-item scale (see Jackson, 1956).



TABLE 2
PEARSON PRODUCT-MOMENT CORRELATIONS BETWEEN TWO ADMINISTRATIONS OF THE
GUILFORD-ZIMMERMAN TEMPERAMENT SURVEY, 18 MONTHS APART,
FOR 49 FEMALE OFFICE WORKERS



                              Less             More             Total
Scale                         experiencedᵃ     experiencedᵇ     sampleᶜ
General Activity              .75 .87 m
Restraint                     .70 .63 .75
Ascendance                    .83 .41 .41
Sociability                   .91 .92
Emotional Stability           E .69
Objectivity                   .71 .71
Friendliness                  .59 .68
Thoughtfulness                .92 £3)
Personal Relations            .37 .68
Masculinity-Femininity        -.25 252
Securityᵈ                     .21 .50
ᵃ With N = 11, r must be .60 for the .05 level of significance, and .74 for p = .01.
ᵇ With N = 23, r must be .53 for the .01 level of significance.
ᶜ With N = 49, r must be .36 for the .01 level.
ᵈ This is a 10-item scale. Using the Spearman-Brown prophecy formula, the correlation
for the total sample for a 30-item Security scale would be .75.


each pair of means was evaluated, using the
customary t test for matched groups. In no
case was it large enough to warrant confidence
that any real change had taken place. Signifi-
cance levels ranged between .70 and 1.00, far
from approaching an acceptable level of con-
fidence.

Even though mean scores on the test did
not change significantly in the 18-month pe-
riod, it is still possible that some scores in-
creased and others decreased, with a com-
pensatory effect that left the means un-
changed. To explore this possibility, Pearson
product-moment correlations were computed
between the 1953 and 1955 scores. The results
of this analysis are presented in Table 2.

Correlations for the total sample and for
the more experienced subjects are all so large
that they would occur by chance much less
often than once in a hundred times. On a few
of the scales for the less experienced group,
correlations are relatively small, indicating
considerable shifting in both directions from
the original scores. Considering that there are
only 11 persons in this category, this result
does not appear to be too serious. Further
examination of these scores indicated that the
small correlations resulted from large changes
in the scores of two subjects who had been
married in the interval between the two test
administrations. The correlations demonstrate
substantial stability of the Guilford-Zimmer-
man scores, and in most cases are of the same
order of magnitude as the reliability figures
published in the GZTS manual (Guilford &
Zimmerman, 1949).
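
The reasoning above rests on two separate checks: a matched-groups t test on the means, and a test-retest correlation that would reveal compensating shifts the means conceal. The following is a minimal sketch of that distinction, together with the Spearman-Brown projection quoted in the footnote to Table 2. The score lists `test_1953` and `test_1955` are hypothetical values invented for illustration and are not data from the study; the function names are likewise assumptions of this sketch.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def paired_t(x, y):
    """t statistic for matched pairs: mean difference over its standard error."""
    d = [b - a for a, b in zip(x, y)]
    n = len(d)
    md = sum(d) / n
    sd = math.sqrt(sum((v - md) ** 2 for v in d) / (n - 1))
    return md / (sd / math.sqrt(n))

def spearman_brown(r, k):
    """Projected reliability when a scale is lengthened by a factor of k."""
    return k * r / (1 + (k - 1) * r)

# Hypothetical scores: half the subjects gain 4 points, half lose 4 points.
test_1953 = [10, 12, 14, 16, 18, 20]
test_1955 = [14, 8, 18, 12, 22, 16]

t_stat = paired_t(test_1953, test_1955)    # means are identical, so t is 0
r_retest = pearson_r(test_1953, test_1955) # yet individual standing has shifted

# Footnote check: a 10-item scale with r = .50, tripled to 30 items.
r_30_items = spearman_brown(0.50, 3)

print(f"t = {t_stat:.2f}, r = {r_retest:.2f}, projected r = {r_30_items:.2f}")
```

In this invented case the means do not differ at all, while the correlation is only about .51, which is the situation the correlational analysis was designed to detect.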


CONCLUSIONS 


The findings from this study indicate that
within the limitations of the type and number 
of subjects involved, and to the degree that 


the psychological situations were different at 
the times of 


ered a personality 
when extreme changes occur in the life situa- 
tion of a subject, such as marriage for a quite 
young girl, some personality traits such as
Emotional Stability, Femininity, and Security
may be markedly affected. On the whole, how-
ever, the scores demonstrate considerable sta-
bility over time, and high test-retest reli-
ability.


REFERENCES 




Bass, B. M., Wurster, C. R., Doll, P. A., & Clair,
D. J. Situational and personality factors in leader-
ship among sorority women. Psychol. Monogr.,
1953, 67(16, Whole No. 366).

Bendig, A. W., & Sprague, J. L. The Guilford-Zim-
merman Temperament Survey as a predictor of
achievement level and achievement fluctuation in
introductory psychology. J. appl. Psychol., 1954,
38, 409-413.

Booth, M. D. A study of relationship between cer-
tain personality factors and success in clinical train-
ing of occupational therapy students. Amer. J.
occup. Ther., 1957, 11, 93.

Eisele, M. C., & Cottle, W. C. The Guilford-Zim-
merman Temperament Survey: I. With rural high
school students. U. Kans. Bull. Educ., 1951, 6,
12-15.

Franks, C. M. Conditioning and personality: A 
study of normal and neurotic subjects. J. abnorm. 
soc. Psychol., 1956, 52, 143-150. 

Gilbert, C. The Guilford-Zimmerman Temperament
Survey and certain related personality traits. J.
appl. Psychol., 1950, 34, 394-396.

Goedinghaus, C. H. A study of the relationship be-
tween temperament and academic achievement.
Unpublished master’s thesis, University of South-
ern California, 1954.

Guilford, J. P., & Zimmerman, W. S. The Guilford-
Zimmerman Temperament Survey: Manual of in-
structions and interpretations. Beverly Hills, Calif.:
Sheridan Supply, 1949.

Hilton, A. C., Bolin, S. F., Parker, J. W., Jr.,
Taylor, E. K., & Walker, W. B. The validity of
personnel assessments by professional psycholo-
gists. J. appl. Psychol., 1954, 38, 287-293.

Isaacson, L. E., & Cottle, W. C. The Guilford-
Zimmerman Temperament Survey: II. Urban high
school students. U. Kans. Bull. Educ., 1952, 6,
46-50.

Jackson, J. M. Summary of conclusions from the
commercial department study. Ann Arbor, Mich.:
Research Center for Group Dynamics, 1955.
(Mimeo)

Jackson, J. M. A Guilford-Zimmerman scale for 
psychological insecurity. Lawrence, Kan.: Author, 
1956. (Ditto)

Johnson, A., & Cottle, W. C. The Guilford-Zim-
merman Temperament Survey: III. With urban
Negro high school students. U. Kans. Bull. Educ.,
1952, 6, 75-80.

Jones, M. B. Aspects of the autonomous person-
ality: IV. Traits from the Guilford-Zimmerman
Temperament Survey. USN Sch. Aviat. Med. res.
Rep., 1954, No. NM 001058.25.16.

Kaess, W. A., & Witryol, S. L. Positive and nega-
tive faking on a forced-choice authoritarian scale.
J. appl. Psychol., 1957, 41, 333-339.

Kirkpatrick, J. J. Validation of a test battery for
the selection and placement of engineers. Personnel
Psychol., 1956, 9, 211-227.

Pitcher, B., & Stacey, C. L. Is Einstellung rigidity
a general trait? J. abnorm. soc. Psychol., 1954,
49, 3-6.

Psychometric notes, Number 5. Beverly Hills, Calif.: 
Sheridan Supply, 1954. 


(Received January 27, 1961) 




Journal of Applied Psychology 
1961, Vol. 45, No. 6, 435-440 


RELATIONSHIPS BETWEEN INDIVIDUAL PROFICIENCY AND
TEAM PERFORMANCE AND EFFICIENCY


WILLIAM M. WIEST, LYMAN W. PORTER, AND EDWIN E. GHISELLI


University of California, Berkeley 


Several studies have been reported compar- 
ing individual proficiency in the performance 
of a task with team proficiency. In general 
these studies have been addressed to one or 
another of the following three problems: (a) 
How does the level of performance of a team 
compare with the performance expected on
the basis of a simple additive combination of 
proficiency of the individuals composing the 
team? (b) How well can the performance of 
a group of individuals working cooperatively 
on a task be predicted from knowledge of 
the proficiency of individual members of the 
team? (c) Is team performance better pre- 
dicted from the individual performance of the 
more proficient members of a team or from 
the less proficient members, and how can the 
measures of individual proficiency best be 
combined to predict team performance? 

Lorge, Fox, Davitz, and Brenner (1958) 
cite several studies indicating that the sum- 
mation of individual contributions in the solv- 
ing of cognitive problems exceeds the per- 
formance which these same individuals are 
able to achieve when working as a group. 
Thus in some cases “group interaction may 
inhibit the fullest potential contribution by 
its members” (p. 354). Studies by Husband 
(1940), Marquart (1955), Shaw (1932), Tay- 
lor and Faust (1952), and Watson (1928) 
support this generalization. 

The second question has been pursued by
Comrey (1953), Comrey and Deskin (1954a,
1954b), and Comrey and Staats (1955) in a
series of experiments using both motor and
cognitive tasks. With the motor task Comrey
found that a surprisingly small proportion of
the variance of the team performance was
predictable on the basis of the proficiency
achieved by the members when performing
the same task individually. In all of these ex-
periments using a motor task the proportion
of predicted team variance was less than one
half. But with a cognitive task Comrey found
that considerably more than half of the team
variance was predictable on the basis of indi-
vidual proficiency.

The relative magnitudes of the relationship 
between team performance and the individual 
performance of the most proficient member 
of the team on the one hand, and the rela- 
tionship between team performance and the 
individual performance of the least proficient 
member on the other hand, appear to depend 
upon the nature of the interaction required 
or permitted by the task. When the group 
task requires the members to perform parts 
of the task alternately, Comrey found that 
team performance tended to have a higher 
correlation with the performance of the least 
proficient member of the team. However, 
when the group task permits greater flexibility 
of interaction among the members of the 
team and less structured team task assign- 
ments, he found team performance to be 
more highly correlated with the proficiency 
of the most proficient member of the team. 

The research to be reported here is focused 
primarily on the second and third of the three 
questions listed above. However, data relevant 
to the first question also are noted. The study 
involves two-person teams working on a cog- 
nitive task in which interaction among the 
members in the performance of the team task 
is allowed but not specified in its type or 
amount. 


METHOD


Nine jigsaw puzzles of approximately equal diffi- 
culty were constructed of 4” fiberboard. Each puzzle 
contained 30 pieces and when assembled formed a 
1-foot square. The faces of the puzzles were various 
abstract designs. On the basis of a small standardi- 
zation sample the puzzles were ranked in terms of 
difficulty of solution. Being ordered in this manner, 
the puzzles could be presented in increasing order of 
difficulty to minimize the effects of practice. The
puzzles are considered to be cognitive tasks because
they primarily require correct judgments or decisions
and only a minimal level of manual dexterity. That
is, the puzzles involve essentially problems of de- 
cision making rather than of implementation. 

The subjects were 64 pairs of male undergradu- 
ates drawn from psychology courses. Most pairs 
were composed of men who were total strangers, 
but a number were casual acquaintances and two 
pairs consisted of men who were friends. 

Four puzzles were used to ascertain each subject's 
level of proficiency on the task while working indi- 
vidually. The puzzles were briefly described to the 
subject, and the pieces were placed face up in random 
order on the table before him. The score for a 
puzzle was the time in seconds required to assemble 
it completely. One subject in each pair was given the 
four puzzles, one at a time, to assemble by himself 
while his partner filled out a personal data question- 
naire. After each subject completed his first indi- 
vidual task he exchanged positions with his partner. 
When both subjects had completed their individual 
tasks they were given the remaining five puzzles, one 
at a time, to assemble as a team. The subjects were
allowed to talk with one another as much as they
wished in deciding how best to proceed as a team.
The behavior and the comments of most of the sub-
jects indicated that they enjoyed the tasks and were
highly motivated to perform well.


RESULTS 


For each individual subject a time score 
was determined by averaging the times re- 
quired by him to complete the first four puz- 
zles. Reliability coefficients were estimated 
from the coefficient of correlation between 
the scores on the odd and even puzzles cor- 
rected by the Spearman-Brown formula. The 
reliability coefficient for the faster subject
was .85, for the slower .90, and for all sub-
jects together .93.
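
The odd-even procedure described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' computation: the `times` table is hypothetical (five invented subjects, four puzzle times in seconds each), and the function names are introduced here only for the example. Each subject's odd-numbered and even-numbered puzzle times are averaged, the two half-scores are correlated, and the Spearman-Brown formula steps the half-test correlation up to the full four-puzzle length.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def split_half_reliability(times):
    """Return (half-test r, Spearman-Brown corrected r) for odd/even puzzles."""
    odd = [(t[0] + t[2]) / 2 for t in times]   # puzzles 1 and 3
    even = [(t[1] + t[3]) / 2 for t in times]  # puzzles 2 and 4
    r_half = pearson_r(odd, even)
    r_full = 2 * r_half / (1 + r_half)         # doubling the test length
    return r_half, r_full

# Hypothetical solution times in seconds (rows: subjects, columns: puzzles 1-4).
times = [
    [250, 260, 270, 255],
    [400, 380, 420, 410],
    [300, 330, 310, 290],
    [500, 470, 520, 540],
    [350, 360, 330, 370],
]

r_half, r_full = split_half_reliability(times)
print(f"half-test r = {r_half:.3f}, corrected reliability = {r_full:.3f}")
```

The corrected coefficient always exceeds a positive half-test correlation, since two halves together give a longer and therefore more reliable measure than either half alone.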

The score for each team was computed as 
the average time required by the pair of sub- 
jects working together to complete the last
five puzzles. The reliability coefficient for the 
team scores was .88. This was obtained by 
the use of the formula given by Horst (1951) 
for the situation where the two parts of a 
test are of unequal length. 

An index was computed for each group
measuring the extent to which team perform-
ance was superior or inferior to the perform-
ance expected on the basis of an additive
combination of individual skills. This index,
designated D(o-p), is the difference between
the observed and the predicted team time,
and therefore is a measure of team efficiency.
The predicted team score was arrived at by
determining the amount of time it would have
taken the team to finish the puzzles had both
members worked at the same rates they did
on the individual tasks. If there were no in-
teraction between the members of a team,
this index, D(o-p), would be zero: that is,
there would be no difference between the team
score predicted on the basis of the subjects’
performance as individuals and their actual
observed team score. A positive value of D(o-p)
indicates a negative or inhibitory interaction
between the subjects when working as a team,
meaning that the team took more time than
expected. A negative value of D(o-p) indicates
a positive or facilitative interaction between
the subjects, meaning that the team took less
time than expected. The mean value of D(o-p)
for the 64 groups was 18.3, which was sig-
nificantly different from zero at less than the
.001 level. This indicates that on the present
task team performance was poorer than would
have been predicted on the basis of the pro- 


TABLE 1
MEANS, STANDARD DEVIATIONS, AND INTERCORRELATIONS AMONG THE VARIOUS SCORES

                                                     Team          Team efficiency
                   Slow     Difference   Sum         performance   D(o-p)            M       SD
Fast               .284*    -.164        .605***     2499          -.219             283.2   ee
Slow                        Boge         .936***     .628***       -.146             1508    55
Difference                               .685***     .316**        -.049             186.6   7.96
Sum                                                  .790***       -.200             753.0   19 a
Team performance                                                   .296*             189.9   2
D(o-p)                                                                               18.3    20.

  * p < .05.
 ** p < .01.
*** p < .001.




ficiency demonstrated by the members of 
teams when working individually. Only 10 of
the 64 pairs achieved a negative D(o-p) score,
that is, did better than expected. Hence with
the present task the team situation produced 
an interfering kind of interaction for most 
pairs of subjects. 
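
The article does not print the formula for the predicted team time, so the following is only one plausible reading of "had both members worked at the same rates they did on the individual tasks": each member's solo rate (puzzles per second) is assumed to add, and the predicted team time is the reciprocal of the combined rate. Under that assumption the index D(o-p) is the observed team time minus this prediction. The function names and the example times are hypothetical.

```python
def predicted_team_time(t_fast, t_slow):
    """Predicted seconds per puzzle if the two solo rates simply add.

    Assumption of this sketch: a member who needs t seconds alone
    contributes 1/t of a puzzle per second to the team.
    """
    combined_rate = 1.0 / t_fast + 1.0 / t_slow
    return 1.0 / combined_rate

def efficiency_index(observed, t_fast, t_slow):
    """D(o-p): observed team time minus predicted team time.

    Positive values mean the team was slower than predicted (inhibitory
    interaction); negative values mean it was faster (facilitative).
    """
    return observed - predicted_team_time(t_fast, t_slow)

# Hypothetical pair: 240 s and 480 s alone, 180 s per puzzle as a team.
predicted = predicted_team_time(240.0, 480.0)
d_index = efficiency_index(180.0, 240.0, 480.0)
print(f"predicted = {predicted:.1f} s, D(o-p) = {d_index:+.1f} s")
```

For this invented pair the prediction is 160 seconds, so an observed time of 180 seconds gives a positive index of 20 seconds, the same direction as the mean of 18.3 reported for the 64 pairs.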

Members of each team were designated as 
"fast" or "slow" in terms of the speed with 
which they performed their individual tasks. 
Table 1 shows that the correlations between 
the individual times of both types of subjects 
and team performance are quite high, the cor- 
relation in the case of the fast subjects being 
higher than that of the slow subjects. 

The time scores of the two members of each
team were added together to yield a “sum”
score, and the fast subject's time was sub-
tracted from the slow subject's time to yield
a “difference” score. As shown in Table 1, the
sum is more highly related to the team score
than is either the fast or the slow subject's in-
dividual score. Even so, the difference score
is significantly related to the team score, in-
dicating that the greater the difference be-
tween the subjects the longer it takes them
as a team to complete the puzzles. This rela-
tionship is to be expected, since the greater the
difference between the two subjects the larger
is the sum of their performance (Table 1).

That the slow and fast subjects’ times are 
not independent predictors of the team time 
is indicated by the significant positive correla- 
tion between their individual scores. As Com- 
rey (1953) points out, this correlation can be 
ascribed to the fact that the fast subject's 
score, the smaller numerical value, cannot be 
greater than the slow subject's score, the 
larger numerical value. This imposes a sys- 
tematic linkage between the individual scores 
of the pair, thus producing an artifactual cor- 
relation. 
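
The artifact described above can be demonstrated with a small simulation. This is a sketch under stated assumptions, not an analysis of the study's data: solo times are drawn independently from a normal distribution with a hypothetical mean of 300 seconds and standard deviation of 60 seconds, so the two members of each simulated pair are unrelated by construction. Labeling the smaller time "fast" and the larger "slow" is then enough to produce a positive correlation.

```python
import math
import random

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
n_pairs = 5000

# Independent hypothetical solo times for the two members of each pair.
member_a = [rng.gauss(300, 60) for _ in range(n_pairs)]
member_b = [rng.gauss(300, 60) for _ in range(n_pairs)]

# Classify within each pair: the smaller time is "fast", the larger "slow".
fast = [min(a, b) for a, b in zip(member_a, member_b)]
slow = [max(a, b) for a, b in zip(member_a, member_b)]

r_unsorted = pearson_r(member_a, member_b)  # near zero by construction
r_sorted = pearson_r(fast, slow)            # positive purely from the sorting

print(f"r before classification = {r_unsorted:.3f}")
print(f"r(fast, slow)           = {r_sorted:.3f}")
```

Because the correlation between the sorted scores arises even when partners are paired at random, a positive fast-slow correlation by itself says nothing about similarity between partners.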

As may be seen in Table 1, the scores of 
the fast subjects bear a low negative correla- 
tion with the difference scores. Such a rela- 
tionship is not surprising when it is recalled 
that all scores are time scores. This negative 
correlation means that the poorer the fast 
subject's score the less difference there is be- 
tween his score and that of his partner. The 
high positive correlation between the slow 
subject's scores and the difference scores in 


437 


part reflects the fact that the variance of the 
scores of the slow subjects is greater than the 
variance of the fast subjects. This, too, is a 
result of placing the fast members of the 
pairs in one category and their partners in 
the other. While a slow subject’s time score 
could not be less than his partner’s, poten- 
tially it could be much larger. Thus, it is 
seen that the greater the time score of the 
slow subject, the greater on the average is the 
difference between the fast and the slow sub- 
jects’ scores. In other words, the poorer the 
slow subject is, the less similar to his partner 
is he likely to be. 

The relationships between the index of
team efficiency, D(o-p), and all other variables
as shown in Table 1 are quite low. While the
negative correlations between D(o-p) and the
scores of the fast subjects, the slow subjects,
and the sum of the scores of both subjects
are not significant, they suggest that the more
highly proficient a person is, that is, the
smaller is his time score, the more difficult it
is for him to improve his or his partner’s per-
formance in a team situation. The significant
positive correlation between team time and
D(o-p) scores indicates that the better the
team performance, that is the less time it
takes the team to solve the puzzles, the more
likely it is to perform better than the level
predicted on the basis of individual profi-
ciency. This positive correlation is not con-
tradictory to the negative correlations be-
tween D(o-p) and individual time scores. It
merely emphasizes the fact that fast teams
are fast in part because the members are co-
operating and interacting effectively, and not
solely because the members of the team hap-
pen to be highly proficient individually.

The scores and measures of individual pro-
ficiency can be combined in at least two dif-
ferent ways in the prediction of team per-
formance. First, the individual scores of the
fast and slow members of each pair can be
used as separate variables in a multiple
regression equation. Secondly, the average
or sum of the individual scores can be used
as one variable, and the difference between
these scores as a second variable, in another
multiple regression equation. The first such
equation would deal with the members of the
group as individuals, and the second would deal



TABLE 2
COEFFICIENTS OF MULTIPLE CORRELATION AND BETA
WEIGHTS FOR PREDICTION OF TEAM PERFORMANCE
AND TEAM EFFICIENCY SCORES

                                    Team             Team efficiency
                                    performance      D(o-p) scores

Prediction from individual                           R = .236
scores of fast and slow                              β fast = -.193
subjects                                             β slow = -.091

Prediction from the sum of and
the difference between the
individual scores of the fast
and slow subjects


with them as a group. These two multiple
regression equations are presented in Table 2,
where both team performance and team effi-
ciency (D(o-p)) are the predicted variables.

Table 2 shows first that the multiple co- 
efficients of correlation using individual scores 
to predict team performance and team effi- 
ciency are only slightly higher than the com- 
parable first-order coefficients presented in 
Table 1. The multiple coefficients of correla- 
tion are .847 for team performance and .236 
for team efficiency, while the corresponding 
first-order coefficients using the sum of indi-
vidual scores (Table 1) are .790 and -.200.
The beta coefficients show that the proficiency 
of the fast member contributes somewhat 
more to the prediction of team performance 
and team efficiency than does the proficiency 
of the slow member in the multiple regres- 
sion equations. 
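
The team efficiency figures in Table 2 can be recovered from the first-order correlations in Table 1 with the standard two-predictor formulas for standardized regression weights. The sketch below uses the three correlations legible in Table 1 (fast with slow, .284; fast with D(o-p), -.219; slow with D(o-p), -.146); the function name is an assumption of this example, and the computation is offered as a check on the printed values rather than as the authors' own procedure.

```python
import math

def two_predictor_regression(r_1y, r_2y, r_12):
    """Beta weights and multiple R for two standardized predictors.

    r_1y, r_2y: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    denom = 1.0 - r_12 ** 2
    beta_1 = (r_1y - r_2y * r_12) / denom
    beta_2 = (r_2y - r_1y * r_12) / denom
    multiple_r = math.sqrt(beta_1 * r_1y + beta_2 * r_2y)
    return beta_1, beta_2, multiple_r

# First-order correlations read from Table 1.
r_fast_d = -0.219     # fast subject's time with D(o-p)
r_slow_d = -0.146     # slow subject's time with D(o-p)
r_fast_slow = 0.284   # fast subject's time with slow subject's time

beta_fast, beta_slow, multiple_r = two_predictor_regression(
    r_fast_d, r_slow_d, r_fast_slow
)
print(f"beta fast = {beta_fast:.3f}")
print(f"beta slow = {beta_slow:.3f}")
print(f"R         = {multiple_r:.3f}")
```

To three decimals the results are -.193, -.091, and .236, which agree with the entries for team efficiency in Table 2 and show the fast member's weight to be about twice that of the slow member.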

The multiple coefficients of correlation in 
Table 2 that utilize group characteristics, the 
sum and difference scores, are the same as
those that involve individual characteristics. 
The former coefficients are .850 and .236 and 
the latter are .847 and .236. The beta weights 
in the correlations involving group character- 
istics show that the differences between the 
members in their individual proficiency con- 
tribute less to the multiple predictions than 
do the averages or sums of the individual 
scores. With regard to the use of the differ- 
ence scores in the prediction of the two de- 
pendent variables in Table 2, it is especially 
interesting to note that a relatively large dif- 
ference between the individuals in their per-
formance as individuals will mean relatively 
good team performance but relatively poor 
team efficiency. The situation with regard to 
the sums, however, is exactly the reverse. 
Large sums indicate relatively poor team per- 
formance but relatively good team efficiency. 


DISCUSSION 


If the questions raised at the beginning of 
this article are now reconsidered, it is pos- 
sible to formulate some answers based on the 
obtained results in the present study. First, 
although it was not the primary aim of this 
investigation, the results support those ob-
tained in other studies that team performance
is not as good as performance that would be 
predicted from an additive combination of in- 
dividual scores. The task of working together 
to produce a completed jigsaw puzzle is thus 
one of a variety of tasks that seems to pro- 
duce inhibitory effects when two individuals 
are performing together as a team compared 
to when they are working apart as two sepa- 
rate individuals. 

In answer to the second question of how
well team performance can be predicted from
a knowledge of the proficiency of individual
members of the team, this study showed that
relatively good prediction can be obtained
from such information. These results are in
accordance with those obtained previously by
Comrey and his associates on certain of their
tasks. However, if the dependent variable is
team efficiency (defined as the relation of ob-
served to predicted behavior) rather than
team performance, the results obtained in the
present investigation show that relatively poor
prediction is obtained from measures of indi-
vidual proficiency. In fact, in the present
study there were no variables (neither sepa-
rate measures of individual proficiency nor
their various combinations) that showed a
very high correlation with team efficiency.
Since this is a measure of team behavior that
has not been used in previous similar studies,
it is difficult to ascribe reasons for the low
predictability of team efficiency or to describe
those conditions under which such efficiency
could be more highly predicted.

A third question posed in the introduction
concerned the relative predictability of team
performance by the more proficient member 
of each team vs. the less proficient member. 
The findings of the present study show that 
the scores of the former type of individual, 
the more proficient member of the team, were 
slightly more predictive of both team per- 
formance and team efficiency. The fact that 
team performance was somewhat more pre- 
dictable from the faster member's individual 
score is in line with Comrey's findings for his 
cognitive task situation (Comrey & Staats, 
1955), but differs from his findings for manual 
dexterity task situations. An analysis of the 
procedures used by Comrey in his experi- 
ments and those used in the present investiga- 
tion indicates that the major factor deter- 
mining whether the faster or slower mem-
ber's individual score will be more predictive
of team performance is not a manual task 
vs. cognitive task difference, but rather the 
amount and type of interaction allowed be- 
tween individuals when performing as a team. 
For example, in both Comrey's cognitive task 
procedure and the procedure used here, the 
faster member was allowed to do more than 
50% of the work if he wished; in Comrey's 
manual dexterity task situations, however, the 
faster member had to proceed alternately with 
the slower member, and thus could use his 
speed on only 50% of the task. Another ex- 
ample of the role of the task determined 
amount and type of interaction is that in both
Comrey's cognitive task and the present task
situation the performance acts of one indi-
vidual provide a variety of cues to the part-
ner that may influence the speed of his per-
formance. As an illustration, the placement
of a key piece in the partially completed jig- 
saw puzzle will give cues as to which two or 
three other pieces can be added next. Like- 
wise, in the crossword puzzle task, the com- 
pletion of a vertical word will provide several 
cues to the partner assigned to the horizontal 
words. On the other hand, in a task situation 
where there is simple alternation of perform-
ance acts and rigid task assignments, the only
cue provided by a single act is the cue for 
the next specified one to take place. 

In the present study, prediction of team 
performance and team efficiency were com- 
puted not only from scores of fast and slow 
individuals, but also from combination scores,
namely, sums and differences of the two indi- 
vidual scores for each team. The sum, of 
course, significantly predicted team perform-
ance, but the difference also was a significant
predictor of this criterion. This latter finding 
regarding the prediction provided by the dif- 
ference scores is in contrast with Comrey's 
findings in his manual dexterity task situa- 
tions. However, even in the present study, the 
difference scores still offered considerably less 
prediction of team performance than did the 
sum scores. Both types of combinations had
low correlations with the team efficiency
measure, although again, the sum scores pro-
duced somewhat higher predictability.


SUMMARY 


The present investigation examined the re- 
lationship between the proficiency of subjects 
solving a cognitive-type task individually and 
in teams of two persons. It was found that 
team work was on the average less productive 
than the sum of individual performances.
Team performance was found to be better
predicted by the individual proficiency of the 
better member than by the proficiency of the 
poorer member. A simple additive combina- 


Scores of both 


team performance, and this prediction was
only slightly i 
proficiencies of th 


y, the more 
a proficient and ef- 


two individuals When performi 
a team. 


REFERENCES 


Comrey, A. L. Group performance in a manual dex-
terity task. J. appl. Psychol., 1953, 37, 207-210.

Comrey, A. L., & Deskin, G. Further results on
group manual dexterity in men. J. appl. Psychol.,
1954, 38, 116-118. (a)

Comrey, A. L., & Deskin, G. Group manual dex-
terity in women. J. appl. Psychol., 1954, 38, 178-
180. (b)

Comrey, A. L., & Staats, Carolyn K. Group per-
formance in a cognitive task. J. appl. Psychol.,
1955, 39, 354-356.

Horst, P. Estimating total test reliability from parts
of unequal length. Educ. psychol. Measmt, 1951,
11, 368-371.

Husband, R. W. Cooperative versus solitary problem
solution. J. soc. Psychol., 1940, 11, 405-409.

Lorge, I., Fox, D., Davitz, J., & Brenner, M. A
survey of studies contrasting the quality of group
performance and individual performance: 1920-
1957. Psychol. Bull., 1958, 55, 337-372.

Marquart, Dorothy I. Group problem solving. J.
soc. Psychol., 1955, 41, 103-113.

Shaw, M. E. Comparison of individuals and small
groups in the rational solution of complex prob-
lems. Amer. J. Psychol., 1932, 44, 491-504.

Taylor, D. W., & Faust, W. L. Twenty questions:
Efficiency in problem solving as a function of size
of group. J. exp. Psychol., 1952, 44, 360-368.

Watson, G. B. Do groups think more efficiently than
individuals? J. abnorm. soc. Psychol., 1928, 23,
328-336.


(Received February 3, 1961)


